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I.  INTRODUCTION  AND  SUMMARY 


One  important  class  of  problems  that  arises  in  applications  of  automatic  control  theory  is 
that  of  adaptive  control  of  a  dynamic  system.  When  suitably  reformulated  in  terms  of  an 
augmented  state  vector,  such  a  control  problem  becomes  one  of  controlling  a  composite 
nonlinear  system  having  some  static  or  slowly  varying  state  components,  usually  called 
parameters,  which  are  not  precisely  known  or  directly  observed.  Most  of  the  recent  work  in 
this  area  has  focused  on  proving  the  global  stability  of  control  laws.  Such  proofs  tend  to  be 
more  feasible  for  less  efficient  control  laws,  however,  and  efficiency  is  also  important  for 
practical  applications.  This  aspect  of  the  problem  can  be  investigated  by  analyzing  it  as  an 
optimal  control  problem  with  respect  to  an  explicit  performance  criterion.  This  approach  is 
pursued  here,  with  all  the  uncertainties  in  the  (original)  system  dynamics,  available 
measurements  and  parameter  values  formulated  as  random  variables. 

Exact  solutions  to  such  control  problems  are  typically  unobtainable,  so  consideration  is 
limited  here  to  certain  cases  in  which  these  uncertainties  are  small,  with  the  objective  of 
developing  asymptotic  approximations  of  optimal  control  laws  in  terms  of  the  perturbations 
induced  by  departures  of  the  random  variables  from  their  mean  values.  In  the  absence  of 
parameter  uncertainty,  the  limitingcase  of  small  random  variations  in  the  measurements 
and  system  dynamics  reduces  the  problem  under  fairly  general  conditions  to  solving  the 
deterministic  optimal  control  problem  that  results  from  all  the  random  variables  assuming 
their  mean  values  with  certainty  (which  is  a  relatively  easy  problem  and  not  examined 
here),  and  then  solving  a  stochastic  optimal  control  problem  in  the  (small)  perturbations  of 
the  state,  control,  and  measurement  variables  from  their  values  in  this  optimally  controlled 
deterministic  case,  where  this  stochastic  problem  has  linear  dynamics  and  state 
measurements  (in  the  perturbation  variables),  a  quadratic  performance  criterion,  and 
Gaussian  white  noise  added  to  the  dynamics  and  measurements  (Reference  I).  The  dynamics 
for  a  scalar  state  perturbation  variable  x  in  a  simple  case  of  this  sort  might  have  the  form 

x  =  /x  +  gu  +  w 

where  u  is  a  control  variable,  w  is  noise,  and  /  and  g  are  known  parameters. 

If  f  itself,  however,  is  a  function  of  some  variable  y,  say  an  aerodynamic  coefficient, 
which  is  not  precisely  known  in  advance,  it  can  be  expanded  in  a  Taylor  series  about  an 
average  value  y.  Truncating  this  expansion  at  second  order  turns  out  to  leave  a  useful  degree 
of  accuracy  here,  and  gives 


f*fCy)  +r(yXr-y)  +  \rG*y -y>*  • 

If  the  typical  size  ofy  —  y  is  k  (assumed  suitably  small),  then  a  new  variable 
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_y-y 

A 


can  be  defined  whose  typical  size  is  unity,  and  for  which 


f  =  f(y)  +  hf'(~y)Q  +  h2f’(y) 82  . 

Thus  we  are  led  to  consider  dynamics  of  the  following  form  for  the  adaptive  extension  of  this 
problem  with  uncertainty  in  the  value  of  f  : 


x  =  ~f  x  +  gu  +  ax9  +  $xQ2  +  w  , 


where  f,  g,  a  and  [3  are  known,  8  can  be  considered  another  state  variable  (static  or  slowly 
varying)  to  be  estimated  along  with  x,  a  is  of  order  A,  and  (3  is  of  order  A2.  A  variety  of 
considerations  of  this  sort  for  the  other  parameters  (such  as  g)  in  the  multivariate  case  of  the 
above  stochastic  optimal  control  problem  leads  to  the  general  problem  formulated  in  Section 
III,  which  is  the  actual  subject  of  this  report. 

An  approximation  of  an  optimal  control  law  is  then  developed  for  this  problem  which  is 
asymptotically  accurate  to  second  order  in  the  scaling  parameter  A  as  A  approaches  zero;  i.e., 
as  the  parameter  uncertainty  vanishes  (first-order  approximations  are  degenerate  here). 
This  approximately  optimal  control  law  is  specified  in  terms  of  a  finite-dimensional  initial- 
value  system  of  ordinary  differential  equations.  The  control  consists  of  generally  cubic 
feedback  on  the  current  conditional  means  of  the  dynamic  system  state  vector  and  a 
parameter  vector  (which  together  form  a  composite  state  vector  in  the  analysis),  and  linear 
feedback  on  the  current  conditional  cross-covariance  matrix  for  these  two  vectors.  The 
means  and  cross-covariance  matrix  are  generated  recursively  by  a  finite  measurement- 
driven  system  of  stochastic  differential  equations,  as  in  a  Kalman  filter. 

The  analysis  here  is  only  formal,  however,  e.g.,  it  is  assumed  without  rigorous 
justification  that  the  neglected  error  terms  are  sufficiently  "well-behaved”  that  all 
quantities  that  appear  to  be  of  higher  order  are  so  in  some  appropriate  sense.  Also,  many  of 
the  order-of-magnitude  arguments  are  made  as  if  the  (dynamic  system)  state  and  parameter 
vectors  were  both  scalars  without  explicitly  mentioning  this  at  the  time.  In  the  multivariate 
case  the  validity  of  these  arguments  might  also  depend  on  some  subtle  conditions  on  the 
dependency  structure  of  the  state  and  parameter  components,  but  such  questions  are  beyond 
the  scope  of  this  report.  For  these  reasons  it  is  important  to  augment  this  type  of  analysis  by 
testing  the  results  on  specific  numerical  examples.  One  such  example  is  included  here  and 
the  theory  appears  to  give  reasonable  and  meaningful  results  in  this  case. 

The  results  here  apply  directly  only  to  adaptive  control  problems  in  which  the  parameter 
uncertainties  are  relatively  small.  The  hope  is,  however,  that  these  results  could  provide 
some  useful  dues  for  constructing  more  efficient  control  schemes  under  more  general 
conditions,  even  though  the  design  process  itself  might  evolve  along  more  pragmatic  lines. 
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CI.  NOTATION 

Unless  otherwise  indicated,  lower  case  letters  denote  (real)  column  vectors  or  scalars. 
Matrices  are  denoted  by  capital  roman  letters.  AT  denotes  the  transpose  of  a  matrix  A.  If  A  is 
square,  lAl  denotes  its  determinant,  and  tr(A)  its  trace. 

It  will  be  convenient  to  make  rather  extensive  use  of  three-way  matrices,  which  are 
always  denoted  by  capital  Greek  letters  here.  For  continuity  of  notation,  the  following 
definitions  are  adopted  for  such  a  three-way  matrix  T,  with  vector  x  and  matrices  A  and  B  of 
compatible  dimensions,  and  with  repeated  indices  denoting  summation: 

(Fx)y  =  Fy0x0  (matrix) 

(Arr)ijk  =  AyXk  (three-way  matrix) 

(ADy*  =  AIO  rajk  (three-way  matrix) 

(FB)y*  =  TijoBgk  (three-way  matrix) 

(F')y*  =  Tjki  and  (rDy*  =  Tkji  (three-way  matrices) 

[tr(Dli  =  F 01<J  (column  vector,  when  applicable) 

T  is  called  symmetric  if  T  =  T'  =  F  =  Fr  . 

With  these  definitions,  the  expression  AFTBDxxT  is  fully  associative.  Many  other 
consequences  are  obvious.  Some  useful  but  less  obvious  properties  are: 

tr<rx)  =  [triDF 

A  tr(D  =  tr(Ar') 
titAD  =  tr(rA) 

(rB)'  =  BTT  and  FAT  =  (AD* 

(ATB)T  =  BTTTAT 

(rx)AT  =  (AD'x  and  (r*x)B  =  (rB)’x 

T  symmetric  •  (B^TBYB  and  A(ATA'rY  symmetric. 

Some  use  will  also  be  made  of  four-way  matrices,  which  are  denoted  by  capital  roman 
bold  letters  here.  The  following  operations  involving  four-way  matrices  are  likewise  defined: 

(A')ykm  =  A jkmi  (four-way  matrix) 

(AF)y*m  =  A^ki  (four-way  matrix) 
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ftr(A)]y  —  Agyg 
(A.A)yitm  =  AiaAajj,m 

(AB)ykm  =  ^-ijko^am 
(Ajc)yk  —  A  ijkoxo 

(X^yk  —  Agy/fXfj 

(rQ)ykm  — 

r ijo^akm 


(matrix,  when  applicable) 
(four-way  matrix) 
(four-way  matrix) 
(three-way  matrix) 
(three-way  matrix) 


(four-way  matrix) 

Analogously,  products  such  as  TA  are  occasionally  formed  by  summing  over  repeated 
adjacent  indices;  i.e.,  a  five- way  matrix  in  this  case  such  that 


and  the  trace  of  higher-way  matrices  is  always  obtained  by  summing  over  repeated  outer 
indices  so  that,  for  example, 

[tr(Bnjy*  —  T . 

Also,  outer  products  of  various  kinds  of  matrices  are  denoted  by  ® ,  so  that 
(A  ®  B)ykm  ~  A-yBkm  (four-way  matrix) 

or 

(r  ®  A)ykmn  =  VykAmn  .  (five-way  matrix) 

Parentheses  are  omitted  in  this  notation  if  the  order  of  association  is  immaterial  or  if  the 
interpretation  is  unambiguous. 

The  probability  density  function  of  a  (vector)  random  variable  x  is  denoted  by  px(  )  and 
the  corresponding  expectation  operator  by  Ez.  The  covariance  matrix  of  x  is  denoted  by 
cov(x).  When  the  meaning  is  clear  from  the  context,  p(x),  p(x/y),  and  E(x)  are  often  used  as 
abbreviations  for  p*(x),  Pxf/xj),  and  Ex(x). 


III.  BASIC  PROBLEM  AND  APPROACH 


In  order  to  include  various  multivariate  extensions  of  the  problem  described  in  the 
introduction,  and  also  to  include  the  possibility  of  adding  some  precisely  known  components 
to  the  motion  state  x,  we  consider  a  situation  in  which  finite-dimensional  real-valued  state 
vectors  x  and  0  have  the  dynamics 

x  =  Fx  -h  Gu  +■  2  trfre*r)  +  2tr<  Ve«r)  +  tr  (F00T)x  +  tr  (L00r)o  +  (/  +  S'e^w.  (1) 
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9  =  Tu  +  eu>2  ,  e  a  scalar 


(2) 


and  for  which  the  measurement  vector 


z  =  Hx  +  2KQ  +  2  tr(Q0xn  +  tr(H007,)x  +  tr(A00T)  +  u  (3) 


is  available  at  each  instant  t  >  to,  where  the  time  argument  t  of  these  variables  and  of  the 
coefficient  matrices,  which  also  may  be  time-varying,  is  suppressed  in  the  notation,  and 
where 

•  u  is  a  control  vector, 

•  w,  u>2,  and  v  are  independent  zero-mean  Gaussian  white  noise  processes  with  respective 
covariance  parameters  Q,  and  R,  which  may  be  time-varying, 

•  at  the  initial  time  to,  which  is  specified  a  priori,  xUq)  and  0(<o)  have  the  independent 
Normal  prior  distributions  N(x o,  f*o )  and  MO,  Lq),  respectively, 

•  e  and  the  components  of  K  and  of  the  three-way  matrices  T,  *P,  S,  and  Q  are 
approximately  infinitesimal,  say  of  order  A;  the  components  of  T, 2  and  of  all  the  four- 
way  matrices  are  of  order  A2;  no  other  quantities,  including  the  components  of  P0~l, 
Lq  ~ 1  and  ft-1,  are  very  large  compared  to  unity,  and 

•  F  =  Fr,  L  =  Lt,  H  =  Hr,  and  A  =  AT  (which  imposes  no  loss  of  generality). 

The  problem  is  to  find  a  feedback  control  law,  specifying  the  current  control  u(t)  as  a  function 
of  t  and  past  control  and  measurement  values,  which  minimizes  the  (scalar)  performance 
criterion 


♦Cttl* 


+  erGe)*  +  ut(b  +  0rMe)« 


+  QtZ  +  trT(A09r)  +  cr|u  +  £0r[ti<<J>xxr)  +  <*eUur)| )  dt  J, 
where  E  denotes  prior  expectation,  fyis  an  a  priori  specified  terminal  time,  and  for  all 

•  Sf  and  A(t)  are  positive-semidefinite 


(4) 
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•  Bit)  is  positive-definite 

0  no  component  of  S^.,  Ait),  Bit),  B~tit),  or  cit)  is  very  large  compared  to  unity 

•  the  components  of  Hf,  Zit),  4>(rt,  and  &(t)  are  of  order  A;  those  of  A,  Ait),  G  it)  and  M(f)  are 
of  order  A2 

Hf  =  hJ  ,  Mt)  =  ATit) ,  Qit)  =  <PTit) ,  0(0  =  er(0 , 

A  =  Ar ,  A'  =  A’r ,  G(0  =  Gr«) ,  G '(f)  =  G’rit) , 

Wlit)  =  MrU),  and  M'(0  =  M’r(t)  . 

The  vector  0  is  slowly  varying  and  is  meant  to  represent  normalized  uncertainties  in  the 
parameter  matrices  F,  G,  Q,  H,  Sf,  A,  and  B  in  what  would  otherwise  be  a  standard  linear- 
quadratic-Gaussian  (LQG)  optimal  control  problem  with  x  alone  as  the  state  vector.  It  is  also 
assumed  here  that  this  corresponding  LQG  problem,  i.e.,  with  all  the  three-  and  four-way 
matrices  zero,  is  such  that  the  mean  and  covariance  matrix  of  the  conditional  distribution  of 
xit),  and  also  the  latter’s  inverse,  remain  of  order  unity,  and  is  such  that  the  prior  expected 
value  of  xTx  remains  of  order  unity  under  the  well-known  optimal  control  law  for  this  case. 

The  actual  objective  here  is  only  to  obtain  an  approximation  of  an  optimal  control  law  for 
small  A.  If  A  were  zero,  of  course,  the  problem  would  reduce  to  the  corresponding  LQG  case 
just  mentioned,  for  which  the  exact  solution  is  known  to  be  a  linear  feedback  control  on  the 
current  conditional  mean  of  x,  which  in  turn  can  be  generated  recursively  from  the  incoming 
measurements  by  a  Kalman  filter.  Unlike  the  situation  described  in  Reference  2,  however, 
the  departures  from  this  linear  solution  are  all  determined  by  effects  of  order  h2  here, 
basically  because  the  small  nonlinear  terms  in  the  dynamics  and  measurements,  and  the 
small  cubic  and  quantic  terms  in  the  criterion,  all  involve  components  of  the  vector  6,  about 
which  no  information  could  be  inferred  if  A  were  zero  (i.e.,  if  the  three-  and  four-way 
matrices  were  identically  zero).  Thus  the  analysis  must  be  carried  out  to  second  order  to 
obtain  nontrivial  results.  Otherwise  the  basic  approach  taken  here  is  the  same  as  that  in 
Reference  2-first,  recursive  equations  are  developed  for  the  evolution  of  the  parameters  of 
an  appropriate  approximation  to  the  conditional  probability  density  of  the  state  given  the 
available  measurements;  then  a  dynamic  programming  analysis  is  used  to  determine  an 
approximation  of  an  optimal  control  law.  In  this  case,  however,  the  problem  is  analyzed  in 
terms  of  the  augmented  state  vector 


and  a  second-order  Edgeworth  expansion  must  be  used  to  approximate  the  conditional 
density  of  this  augmented  state  to  order  A2.  Considerable  simplification  occurs  because  of 
the  special  structure  of  this  case  as  a  nonlinear  control  problem,  though,  and  only  the  first, 
second,  and  some  third  central  moments  of  this  Edgeworth  expansion  actually  need  to  be 
computed  for  the  purpose  of  determining  the  largest  nontrivial  effects  of  0-uncertainty  on 
the  optimal  control,  meaning  the  asymptotic  form,  as  h  approaches  zero,  of  the  departures  of 
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the  optimal  control  from  that  for  the  corresponding  LQG  problem.  These  departures  are  all 
of  order  hi  when  the  time  fe-  to  over  which  the  system  is  being  controlled  is  of  order  unity, 
but  can  attain,  at  least  in  formal  appearance,  magnitudes  of  order  unity  when  this  control 
interval  becomes  of  order  1/A2.  The  analysis  here  will  not  be  even  formally  valid  for  control 
intervals  of  larger  order  of  magnitude.  It  is  necessary,  however,  to  compute  fourth  central 
moments  (and  the  other  remaining  parameters  of  the  Edgeworth  expansion)  in  order  to 
implement  an  alternative  approximation  of  the  optimal  control  law  which  has  the  same 
formal  degree  of  asymptotic  accuracy  (in  h )  but  which  might  perform  better  for 
noninfinitesimal  h.  These  fourth  central  moments  are  also  necessary  for  determining  the 
conditional  state  density  to  an  accuracy  of  order  A2. 


IV.  MOTION-STATE  AND  PARAMETER  ESTIMATION 


To  determine  the  joint  conditional  density  of  x(t)  and  0(1)  to  order  A2,  given  the 
measurements  up  to  a  generic  time  t,  the  estimation  process  is  considered  in  discrete  time 
increments  of  size  St,  with  8f  <  <  A6  so  that  6l3/2  <  <  h^St.  The  result  is  then  established  by 
what  is  basically  an  induction  argument.  It  is  assumed  as  an  induction  hypothesis  that  the 
joint  conditional  density  p[x(t),  0U)/y(f)],  where  yit)  denotes  the  measurement  history  up  to 
time  t,  has  the  form  of  a  certain  class  of  Edgeworth  expansions  except  for  an  error  of  higher 
order  than  A2.  This  class  of  density  approximations  includes  the  Normal  joint  prior  of  xUq) 
and  0(fo)  as  a  special  case,  and  is  shown  in  complete  detail  for  scalar  x  and  0  in  the  appendix. 
The  induction  argument  is  then  completed  by  calculating  the  corresponding  conditional 
density  at  time  t  +  8  t  to  a  formal  accuracy  of  order  A28f.  The  calculation  shows  this  latter 
density  to  be  of  the  same  form,  with  each  parameter  having  changed  by  only  St  times  its 
hypothesized  order  in  h  and  the  error  by  an  amount  of  higher  order  than  A28f.  Since  this 
error  therefore  remains  formally  small  compared  to  A2  when  accumulated  over  a  time 
interval  of  order  unity,  the  desired  result  is  obtained  for  t  —  to  of  this  size.  It  is  also 
reasonable  to  consider  this  result  formally  valid  for  longer  elapsed  times  on  the  basis  that 
the  joint  x,0  system  is  "sufficiently  observable”  that  the  effects  of  the  error  increments  are 
damped  out  rapidly  enough  over  time  that  their  cumulative  effect  always  remains  small 
compared  to  A2.  No  investigation  has  been  made  of  precise  conditions  required  or  sufficing  to 
make  this  happen,  however.  The  resulting  equations  governing  the  evolution  over  time  of 
the  parameters  of  the  second-order  Edgeworth  approximation  have  the  formal  appearance  of 
being  stable  in  forward  time  with  time  constants  of  decay  which  are  short  enough  that  these 
parameters  at  least  would  remain  of  the  proper  order  of  magnitude  indefinitely  under 
normal  circumstances.  The  control  u  is  treated  as  a  known  deterministic  time  function  in 
this  section. 

The  rest  of  this  section  is  mostly  devoted  to  a  description  of  induction  step  calculations 
and  the  final  result.  These  calculations  are  too  lengthy  to  be  presented  in  detail  here,  so  only 
a  rather  specific  outline  of  the  procedure  is  given.  The  calculations  are  carried  out  in  detail 
in  the  appendix  for  the  case  of  scalar  x  and  0,  however,  except  for  any  consideration  of  the 
higher-order  error  terms,  which  are  simply  deleted  whenever  they  occur  there. 

In  the  calculations  of  the  induction  step,  the  transition  of  the  composite  state  from 
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*m  .  jrf<+8f> 

9m  0u+8<> 


is  approximated  to  order  /»2S/  by  the  following  sequence  of  transformations: 


1. 


I+F8t  '  2'Vu&t 

“hoTT-  I 


ir*»i  i 

0 

+ 

-v» 

* 

T 

udt 


(where  the  t  argument  is  suppressed  except  for  x(t)  and  8(0) 

2. 


e. 


3. 


2tr<reixJ’)  +  WFQjeJ’lXj  +  tittle  JV 
__ 


:3  =  L2  +  VsT* 


8 1 


ej  e, 


4. 


w  t  is  Normal 


[o,(l  + 1  8j)Tq(i+  2 'e2)| 


given  xz  and  02  . 


*<t+8« 

0<t+8« 


+  VsT 


u;  2  is  Normal 


Starting  with  the  induction  hypothesis  for  pfxtt),  8(t)/y(t)l,  the  joint  density  of  each  of  these 
variables,  also  conditioned  on  yU),  is  then  computed  in  turn  to  order  hftt  according  to  the 
following  outline  for  each  variable  in  the  sequence: 


1.  Since  u  is  a  known  constant,  the  joint  density  of  xi  and  9i  given  y(t)  follows  easily 
from  standard  results  for  linear  transformations  of  random  variables. 


2.  From  the  definitions  ofx2  and  82,  either 


*.  =  *2-6,[2u{rv0  +  *'('V.r)*» + 

to  order  h2&t  or  Ixil  >  >  1  for  X2  and  82  of  order  unity.  Also,  the  Jacobian  matrix  whose  ij-th 
element  is 
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can  be  expressed  as 


F+2r\  +  t^02ej)  j  2rTx2  +  +2trT^LQ2uTj 


whose  determinant  is  1  +•  8t  [tr(F  +  2r02)  4-  tr  tr  (F02027’)]  to  order  k25t.  The  joint  density  of 
X2  and  02  can  then  be  obtained  to  this  accuracy  from  that  of  X[  and  0i  for  X2  and  82  of  order 
unity  by  applying  the  standard  formula  for  the  transformation  of  probability  densities, 
because  large  values  ofxi  and  61  have  negligible  density  and  xi  — 12  is  otherwise  of  order  h&t. 
This  approximation  is  also  adequate  for  large  X2  or  02  because  xj  and  0]  cannot  both  be  of 
order  unity  then,  and  the  density  is  again  negligible.  This  result  is  converted  to  the  desired 
Edgeworth  form  by  expanding  it  in  a  Taylor  series  about  the  mean  and  covariance  matrix  of 
X2  and  @2,  and  deleting  terms  which  are  negligible  compared  to  k28t.  This  mean  and 
covariance  matrix  can  be  evaluated  directly  to  this  accuracy  from  the  definition  of  xo  and 
what  is  known  about  the  joint  density  of  x\  and  6{. 

3.  The  joint  density  of  X3  and  83  is  calculated  according  to  the  standard  formula 

p  (x,0)  =  (  p  (x  -wVSF,6)p-  (u/,x,d)  dw 
*3  *2  “V*2 

for  the  sum  of  two  random  variables,  where  the  conditioning  on  yU)  is  suppressed  in  the 
notation,  and  where 


.-121 


and  *3  =  [Oj 


This  integral  can  be  evaluated  to  an  accuracy  of  order  W8f  by  "completing  the  square"  for 
the  product  of  the  two  exponential  factors  and  using  standard  results  for  evaluating  the 
moments  of  Normal  distributions.  It  is  then  straightforward  to  evaluate  the  mean  and 
covariance  matrix  of  S3  to  order  h2$t.  Expanding  the  density  approximation  obtained  for  S3 
in  a  Taylor  series  about  these  values  and  deleting  terms  small  compared  to  h2^.  gives  the 
result  in  the  desired  Edgeworth  form. 

4.  Obtaining  the  joint  density  of  x(t+5t)  and  0(t+  A)  (to  order  A*8 1)  from  that  of  X3  and 
83  is  conceptually  just  a  repetition  of  the  previous  computation,  the  formula  for  the  sum  of 
the  appropriate  two  random  variables  this  time  being 


p.  (x,6)  a  [  p  (x,0  -wVT)p-  (w)  dw, 
*4  1  -«  *3  **1^3 


where 


*4=  «!+&)  ' 
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Actually,  w<i  is  statistically  independent  of  s 3  and  the  conditioning  indicated  above  is 
superfluous.  Because  of  this  fact  and  the  small  variance  of  W2,  the  details  are  much  simpler 
than  in  the  previous  computation. 

Once  p(x(t  +  6t),d(t+6t)/y(t)]  has  been  obtained  in  this  way,  the  induction  step  is 
completed  by  approximating  the  effects  of  the  measurements  between  t  and  t  +  8t  by  a 
random  variable Tsuch  that  pf27xU+8t),  9U  +  8i)]  is  Normal  and  conditionally  independent  of 
y(t)  with 


mean  (2)  -  tf(t+St)x(f-l-8t)  +  2X'(t+8t)9(f+8t)+  2tr  [Q(t  +  8t>0U + St)xT(t + 6t)| 
+  trfHU + 8*)0(t + St)0T(/ +8/)]x(/+ St) + trt  A  (t+8t)Q(t+ 8t)QT(t + 8t)l 

and 


cov  (2)  =  —  R(f+8t) 

St 

which  is  formally  accurate  except  for  terms  of  order  St^,  and  therefore  accurate  to  order 
A2fit  since  8f<  <h*.  This  construction  allows  plx(t+8f),0(t+8t)/y(t+8t)l  to  be  determined  by 
the  Bayes  rule  to  this  accuracy  as 


p[x(f +8t),0(t+8tyy(t)l  •  plz«t+6f),0(t+8t)| 


divided  by  its  integral  over  the  x(t+6t)  and  0(t-f  5t)  arguments.  Expanding  this  density  in  a 
Taylor  series  up  to  terms  of  order  h?8t  about  its  mean  and  covariance  matrix  (treating  z  as  a 
quantity  of  order  8 1~ 1/2)  then  gives  the  final  result  described  above  for  the  induction  step. 
Zero-mean  random  terms  of  order  VST”  but  of  third  or  higher  order  in  h  are  also  deleted  in 
this  last  expansion  if  they  are  statistically  independent  for  disjoint  time  increments  (of  size 
8<)  because  it  follows  from  the  Chebychev  inequality  that  such  terms  only  contribute  effects 
of  order  h *  when  accumulated  over  time  intervals  of  order  unity,  except  perhaps  for  a  set  of 
realizations  with  negligible  prior  probability. 


The  second-order  Edgeworth  approximation  for  p(x,0/y)  which  is  established  by  this 
induction  argument  has  the  form  of  a  Normal  density  multiplied  by  a  polynomial  in  x  and  0 
whose  leading  term  is  unity.  The  parameters  of  this  approximation  are  the  mean  and 
covariance  matrix  of  the  Normal  factor,  which  are  also  (to  order  A3)  the  mean  and  covariance 
matrix  of  the  joint  conditional  density  of  x  and  0,  and  a  set  of  quantities  determining  the 
coefficients  of  the  polynomial  factor,  all  of  which  are  small  compared  to  unity.  These 
parameters  are  denoted  here  as  follows,  and  have  the  following  interpretations  in  terms  of 
this  joint  conditional  density: 


x  =  E(x) 
0  =  E  (0) 


partition  of  mean  vector 
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M  =  E  [(x  -  x)  (x  -  £)rl 
E  =  E[(x-2)(0-§)rl 
L  =  E[(0 -§)(0  -  3)T1 


partition  of  covariance  matrix 


A  =  *E{[(x-x)(0  -  0)r](x-  if) 

V  =  *E  {[(0  -  0)  (x  -  x)T1(0  -  §)T} 


two  partition  components  of  three-way 
matrix  of  third  central  moments 


N,  a  four-way  matrix  whose  ijkm-th  component  is  (to  order  h?) 


1 

-  E 
4 


(x-x)  (0-0)  (0-0).(x-i) 

i  j  n  n 


-  M  L .. 

4  im  jk 


1 

-EE  .  . 
2  v  «* 


Also,  the  components  of  E  and  A  are  of  order  h,  and  those  of  V  and  N  are  of  order  h2.  As  in  the 
prior  density,  x,  M  and  L  are  of  order  unity.  From  Equation  2,  0  is  the  time  integral  of  a 
control-dependent  term  of  order  h 2  (assuming  control  values  are  kept  to  order  unity)  and  a 
white  noise  term  with  variance  parameter  of  order  h 2,  so  0  will  be  of  order  unity  for  the 
estimation  intervals  of  order  1  Ih?  being  considered  hgre.  Since  the  posterior  variance  L  (of  0) 
is  of  order  unity,  this  means  that  the  posterior  mean  0  is  also  of  order  unity. 

If  the  calculations  of  the  induction  step  are  carried  out  according  to  the  preceding 
outline,  the  eventual  result  is  a  coupled  set  of  recursions  for  generating  these  parameters  to 
order  h2  from  the  incoming  state  measurements  (see  the  appendix).  If  z  is  used  to  denote  the 
average  of  the  measurement  vector  over  a  current  discretization  time  increment,  these 
recursions  can  be  expressed  as  measurement-driven  differential  equations  in  terms  of  the 
formally  corresponding  continuous-time  variables.  With  the  time  argument  t  suppressed  in 
the  notation  and  with  the  definitions 


£  =  eq), 

z  =  Hx  +2 ire  +2tr  |fl(0xr  +  ET) ]  +tr  [  A  (e0r  +L)j  +tr  [hWB7,  +£,)]  £, 


(5) 


(6) 


and 


for  the  time  function  P  defined  by 

P  =  FP  +PFT  +Q  -  PHTR "  lHP ;  P{tJ  =  PQ  , 


(7) 


(8) 


these  differential  equations  become  the  following; 
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x  =  Fi  +Gu  +  2  tr  [i^Gx7  +£rjj  +2  tr('V6uTj 
tr  |f^007  +l)|  i  +tr  [^L'u^00r  +  L  j]  +  j(p  +  2I>)  |ffr  +  2Q0 
rT^H00rj|  +2  \e(kT  +  flTx)  +  tr^A'fl7 j  +  tr^A'Qj  |  fl_1(2-  rj; 


A  A 

x(V  =  x0 


0  =  Tu  +  [ET(tfT  +2fl0  j  +2Z,[a'7’  +  Q7x  +  Ar0 

+  t^H'^O7)]  [  IT l(2  -  j) ;  0^ J  =  0 


D  =  [f  - PHTR~lHy >  +D(pr-Hr«-1/ff»)  +  [^r-Q'R~1HPj 

+  (rT-  phtr~1Q't^et^  x  +  [(r-  phtr~1Q’^p  +20) 

+  (p  +2D)(rr'-  QT'R~1Hpj  +s|  0  +  (e'F  +'V,TETJu 

+  i  |ptrrjF^00r  +L  +  tr  |f^907  +£.j|p  +  tr|  £ '^007  +£^S  ,rQ  J  J 

-  PHTR~X j2  tr^QA’J  +KET  +  j  tr  [h^©^  +L )|pJ 

-  j2tr^A'Or)  +  iptrr[H^00r-(-L^j  +EKT^R~lHP 
-  2  (dht  +pq0)r_1[hp  +(arpy§  j  +  [[ \'L-xET 

+  (a  ’L~lET'j  +(\'L~lETJ  j/fr  +(eQtP  +PQ£t)'  +2\(kT 
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E  =  (f -PHTR~Xh)E-2PHTR~1KL  -2DHTR~1(hE  +KLj  -4- 2  [ ( TZ- 
-  [lQ'R~1h(p  +2Dj|'i  +\ETr  -Et(htR-1Q"  +Q"TR-lHjp 
-  l(  KtR  -  lQ*  +  A  R  -  1hJp  J  ’8  +  ( *VL  j*u  +  tr  j  (  F"  -  PHTR  ~  ‘H"  jL0*r  ] 

+  trr|(lX  j'u6r|  J  +  2{a(r7V2Q0  j  +  [z(ftr  +  Hejp 

+  2L0TD|'jR'‘(z-ij;  =  0  (12) 

L  =  e2Q2  —  ETHT  +  2L([KT  +  QT^R-i^HE+2KL  +2(nL^|  +  2j  V’ffT  +  ETQL 
+  LQTE+l(a  +  ^H'x^L |  '|r~1^z— z^;  =  LQ  (13> 

A  =  (f  -  PHTR~1h) A  +  A (fT-HTR~1HP^  +[prL^'  +(Lrrpj'  -PHTR~l(pQLy 
-  ^QP^'R-^P  +£,£'  +2[(r£,y  -  AHTR'lff -PHrR"1(QI.y-p(LQT),R'1H|D 
+  2D|^trrj'  -HTR~lHA  -(l,Qr)'R“lHP  -^R^^QL^p]  +  [p[fo)t£.]' 

+  |l(f0)p|'  -PHrR_1[p(H0)rL|'-[£,(Hejp|'R'tflP-2p(Q0^R_l(pQLy 
-  2^LQrP j'R'^Q'7©  jp  -2[a(pH7’R"iQ*  +PQ7’'R',h)'  +  (p/frR'1Q'A 

+  PQr'R"IHAj']0  +  iJ[(s£rQj'r +^2frQj"r]t|,,r'0;  A^J  =  0  (U) 

V  =  |^F-.PffTR“lff^V*]'+£rri,  +LrrE  +  j  jl,|ti|FL^x  +ti^LL^ujT|' 
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■  ETHTR-^LQTpj  -  (pQL'j’R-'HE  -  [e7#7,  -^(V'  +  Q7^  ^"‘(a  +LSlTpj 
(a  +  PQL J"R “ 1 1 HE  +  2KL  +  2^«Z.  j'x |  -  l(  A'R~lHPj'L  +  2l[q,JI“1(*  -  P||a' 
l'R_1(z- z)|TL  +2[n//T«'1(«- z)|';  v(t0)  =  o  (15) 


+  2A" 


N  =  (f- PHtR  -  lH )n  +  N^Fr  -  HTR  ~  lHP  j  +  j  | p(l.FL  j"  +  (lFTL  )'P 

—  P^LHL^”R~lHP  —  PH7/1'"1^LH7'l  j"P  J  -  ^LSrQ2L 

-  [AHr+(LQrpj']p"l|/fA  +(pQ£,)'|  -  \HTR~1(pQL  j* 

-  ^LQrP^'R-lHA  -  |^L2rQ2L  +\HTR-l(pQL ^  +(LQ7’P^'R_1ffA 

+  |  Ai/7  +^LQTP^'  ]p~l|/fA  +  ^PQL  j*J  j’r ;  n(^  =  0 


(16) 


If  the  formalism  of  these  differential  equations  is  interpreted  in  the  sense  of  Ito  equations 
with  z  now  denoting  the  original  state  measurement  vector,  these  equations  generate  the 
same  parameter  values  as  their  discrete-time  counterparts  to  a  formal  accuracy  of  order  k2 
for  time  discretization  intervals  8t  <  <  h*.  These  Ito  differential  equations  thus  determine 
the  joint  conditional  density  of  x(t)  and  Git),  and  the  corresponding  conditional  moments 
indicated  earlier,  to  a  formal  accuracy  of  order  h2  in  the  original  continuous-time  problem. 
It  also  follows  from  the  form  of  these  equations  that  P  =  PT,  D  =  DT,  A  =  AT.  V  a  yT,  N  = 
Nr  and  N'  =  S’r,  which  is  also  necessary  if  these  quantities  are  to  have  the  conditional 
moment  properties  ascribed  to  them  earlier. 

For  state  estimation  purposes,  ft  is  sometimes  preferable  to  work  with  the  quantity  A ( 
directly.  In  this  case  one  can,  for  the  same  order  of  formal  accuracy,  integrate 


M=  {F+2rr0  +tr[F^9er+£.)||*f+2(ri+*ru)Er 
+  M  [pr  +2re  +trr[i|e0r  +l)|  j  +2s(rTx  +  «Pru)  +  Q  +2£'0 
+  tr|  E'(l  +00r)2'rQ|  -  [w[ff7'+2Q0  +trr^H00r  +  HI*j  j  +4tr^A'Qrj 
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+  2E^KT  -t-Q7xjj/?-1jJ/f  +2Q’T§  +tr^H8flr  +  HL^|Af  +4  tr^QA'j 
+  2^K  +  Q*x^£rj  +  2]  |A'L_lEr  +(A'L~lETj'  +(a'L~1ET  j' ] // r 

+  (eQtM  +MQeJ  +2A’^KT+QTx)j^R-l^z-2y.  itf(f0j  =  P0  (17) 

•  * 

in  place  of  P  and  D,  and  replace  P  by  M  and  D  by  zero  everywhere  in  Equations  9  and  10  and 
12  through  16.  It  is  also  sometimes  inconvenient  to  work  with  appropriately  normalized 
motion-state  and  parameter  vectors  x  and  0.  In  the  scalar  case  examined  in  the  appendix, 
the  variables  of  actual  interest  are 


x  6 

Tu^vr 

and  the  (formal)  validity  of  the  approximations  made  in  the  derivation  really  depend  on 


being  of  order  unity, 


M 

R 


Q 

M 


e. 


_0_  _T_  _ 

vr  vT1 


being  of  order  h,  and 


LLu  AL 

FL,  ^  H L  and 


being  of  order  A2,  or  these  same  conditions  with  M  replaced  by  P.  In  the  multivariate  case 
this  validity  actually  depends  on  some  reasonable  multivariate  extension  of  these 
conditions.  The  dependency  structure  of  the  equations  of  motion  and  measurement  probably 
plays  an  important  role  here,  however,  which  complicates  the  situation. 


V.  CONTROL  OPTIMALITY  CONDITIONS 

The  optimal  control  law  approximations  here  are  derived  by  applying  conditions 
resulting  from  a  standard  type  of  dynamic  programming  analysis.  The  first  step  is  to 
consider  the  conditional  expected  "cost-to-go"  under  an  optimal  control  law  at  a  general 
time  t  «s  to,  by  which  is  meant  a  generalisation  of  the  definition  of  Equation  4  in  which  the 
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integration  is  from  t  to  tf,  the  expectation  is  conditioned  on  the  controls  and  state 
measurements  up  to  time  t,  and  the  control  values  u( z),  /  S  t  S  tf,  are  generated  from  the 
measurements  available  at  time  i  by  an  optimal  control  law.  This  expected  cost-to-go  is  a 
function  of  time  and  the  currently  available  measurement  data  (including  past  control 
values),  but,  because  of  the  statistical  independence  of  future  process  and  measurement 
noise  values  from  the  current  composite  state  (x,0),  the  cost-to-go  depends  only  on  this 
state’s  current  conditional  distribution  (Reference  3).  This  distribution  is  specified  via  the 
multivariate  extension  of  Equation  A^l  to  order  hr  by  the  current  values  of  the 
measurement  statistics  x,  8,  D,  E,  L,  A,  V  and  N.  Assuming  now  that  some  arbitrary  but 
reasonable  contrpl  law  has  been  specified,  the  noise-induced  fluctuations  of  all  these 
statistics  except  0  and  L  from  their  prior  expected  values  (for  this  choice  of  control  law)  will 
become  essentially  independent  over  time  intervals  of  order  unity  because  each  is 
multiplied  by  the  coefficient  matrix  ( F—PHrR  ~  lH)  in  the  differential  equation  generating 
it  from  the  measurements  in  the  system  of  Equations  5  through  16.  The  observability 
properties  being  assumed  for  this  problem  imply  that  this  coefficient  matrix  will  have  the 
effect  of  "damping  out”  the  effects  of  these  fluctuations  over  longer  time  intervals.  In  using 
Equation  13  to  generate  L,  on  the  other  hand,  the  contributions  of  these  fluctuations  and  of 
the  last  measurement-driven  term  are  effectively  integrated  over  observation  intervals  up 
to  order  l/h2  since  Equation  13  must  be  integrated  this  long  (except  for  a  negligible  set  of 
realizations)  to  change  L  significantly  compared  to  its  typical  size  of  order  unity.  Thus  the 
departures  oi  L  from  its  prior  expected  value  over  an  interval  of  this  length  will  essentially 
be,  in  order  of  magnitude,  the  sum  of  l/h2  zero-mean,  statistically  independent  terms,  each 
representing  the  effect  of  such  contributions  over  a  time  interval  of  order  unity  minus  its 
prior  expected  value.  From  Equation  13  and  the  orders  of  magnitude  of  the  quantities 
involved,  each  of  the  terms  in  this  sum  is  of  order  h2,  and  therefore  a  random  variable  whose 
variance  is  of  order  h*  From  a  standard  result,  such  a  sum  is  a  zero-mean  random  variable 
with  variance  of  order  h2.  There  are  fewer  terms  in  this  sum  for  shorter  observation 
intervals,  and  its  variance  would  then  be  even  smaller.  By  the  Chebychev  inequality,  this 
means  that  the  departures  of  L  from  its  prior  expected  value  are  of  order  h  over  observation 
intervals  up  to  order  l/h2  (except  for  a  set  of  realizations  with  negligible  probability,  which 
will  l>e  ignored  henceforth).  Approximating  L  by  its  prior  expected  value,  and  6  and  D  by 
zero  in  Equation  14  therefore  changes  A  by  order  h2.  Since  the  "damping  coefficient  matrix” 
(F—PHTR~lH)  multiplying  A  in  Equation  14  is  such  that  these  errors  only  accumulate 
over  time  intervals  of  order  unity  in  the  integration  of  this  equation,  the  error  in 
approximating  A  in  this  way  is  formally  only  of  order  h2.  This  approximation  of  A  is  a 
deterministic  time  function,  and  is  in  fact  the  prior  expected  value  of  A  (to  order  h2,  the 
accuracy  of  the  state  estimation  analysis).  Likewise,  approximating  both  L  and  A  by  these 
determinis^c.tiipe  functions  in  Equations  6,  9  through  12,  15,  and  16  changes  E  by  order  h2 
and  leaves  2, 8,  D,  v,  N  formally  unchanged  to  this  order  of  accuracy. 

Now  consider  the  possibility  that  the  optimal  conditional  expected  cost-to-go  is,  to  an 
accuracy  of  order  h2  except  perhaps  for  a  set  of  realizations  with  negligibly  small 
probability,  a  scalar-valued  function  (also  called  J  here)  of  the  form 

J  =  jxr(s  +  trj*  +(< +n  Y*  +*Te  +tr[(s  +  rjo  +evt  | 

+  +*TNe  +X7  tr(firir)  +6rtr(*D) 
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+  l-QTtr(^T)  +  iirtr(2'Ser)  +  tr  [z>tr(lt£>) 

+  i  tr  [fftr(BEr)|  +  ^xTtr(mTj: 


x  +<J  , 


(18) 


where  W,  q,  <}>,  Y,  V,  U,  N,  n,  V,  H,  2,  R,  B,  I  and  o  are  deterministic  time  functions  and  the 
time  argument  t  is  suppressed  in  the  notation,  where  S  and  £  are  the  time  function  specified 
by 


and 


S  =  -SF -FtS  -  A  +  SGB~lGTS  ;  S{tf)  =  Sf 


4  =  (^SGB~*GT —FT^i,  +SGB~lC\  <{lr)  =  0  , 


(19) 


(20) 


where  x,  0,  D  and  E  are  the  moment  parameters  of  the  conditional  state  density  at  the 
current  time  t  approximated  to  order  A2  by  Equations  5  through  16,  and  where 

a  is  of  order  1/A2, 

$  is  of  order  1/A , 

Y,  U,  R  and  B  are  of  order  unity, 

V,  N,  II,  V,  H  and  q  are  of  order  A , 

W,  2  and  I  are  of  order  A2 , 

W,  Y  and  U  are  symmetric, 

H  =  Hr,  2'  =  2,r,  V  =  VT , 

i  =  iT,  r  =  rT , 

B  =  B*.  and 
R  =  R*  =  Rr  . 

By  their  definitions,  S  and  P  are  also  symmetric  and  of  order  unity,  l,  is  of  order  unity,  and  D 
is  symmetric  and  of  order  A.  The  nonzero  first  and  second  partial  derivatives  of  this  function 
J  with  respect  to  variables  other  than  t  are 


jjs^js+v+H'e  +  tr^iee^]  +<T-hqr+erAfr+trT^Ern'  +  iseer) ,  <211 
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J*  =  <DT  +  xTN  +  9T\u+2’x  +  tr(riiT)|  +  trT( 

1  AA*  \ 

—  Hxx  +VDJ 

(22) 

JD  =  S  +tr(R d)  , 

(2.1) 

J£  =  VT  +  iT*  +  tr(  BETj  , 

(24) 

Jaa  =  S  +  W  +  H'0  +  tr(l90rj  , 

(25) 

«/£=  AT  +  H'x  +S'9  +  2tr(  I'x8rj 

. 

(26) 

(27< 

«/~=  (Z+a’i  +  tr^r»Tj  , 

(28) 

Jm  =  v  ’ 

(29) 

qq  —  R  »  and 

(30) 

=  8  ’ 

(31) 

where  subscripts  now  denote  partial  differentiation  with  the  convention  that 


/  \  3*J 

jijkm  =  a E  dE' 

im  kj 

Since  the  other  second  partials  are  zero,  the  usual  invariant  imbedding  formalism  of 
dynamic  programming  (Reference  4),  applied  to  a  very  small  time  increment  6/  and  using  x, 


20 


NWCTP6530 


9,  D,  and  E  as  state  variables,  shows  that  the  Bellman  equation  for  the  optimal  expected 
cost-to-go  function  reduces  to  the  following  equation  for  -J  to  second  order  in  h,  at  least  for  all 
but  a  set  of  realizations  with  negligibly  small  probability: 


=™“.E,*»45'r(A  ^  + j«T(B 

+  0'9  +erM0 'ju  +  |eTZ  +  trr[A09rj  +cT  u 


80 

+  Jei7  +tr 


5 D  1  SzS£r  1 8 D 

tJ  ~  "f  ”  c/AA 

D  St  2  “  St 


tr  (  JJ>D  j 
25 t  \  D  ) 

18 E  (  T\|  /  8  E  8x80r\ 


A  A  _ 

(SE  *  SD  a\  I  /  8888 r  \1 

tr  — +  — J_s80  )  +  -  tr  J~ -  )}  • 

V  St  Ex  St  /  2  V  06  St  fi 


(32) 


In  Equation  32, 8x  denotes  the  increment  in  x  that  occurs  during  the  interval  [/,  t+8t]  for  the 
control  value  u,  which  is  presumed  to  be  constant  during  this  interval  with  negligible  loss  of 
generality,  and  likewise  for  88, 8 D  and  SE.  At  the  final  time  t.,  the  integral  term  of  Equation 
4  vanishes  in  the  definition  of  the  cost-to-go  function,  so  J  must  satisfy  the  condition 


j(tf,Xf,QfJ)f,Ef^  -  E/<ht-y[  +9*ABf)xf+  2%Tf(^fxfxTf  )[ 


to  second  order  in  hfor^all  but  a  negligibly  improbable  set  of  the  realizable  values  of  the 
final-time  statistics  x^,  0^.,  Df  and  E^.  From  the  moment  properties  and  orders-of-magnitude 
of  these  statistics,  the  boundary  condition  of  the  Bellman  equation  for  J  therefore  becomes 


■{*,&»*)  -  i?[s, +  “K)l  +  j  "  +2D) 

+  7  tr[erH^  +  trtr^HfAfJ 


(33) 
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to  order  /i2,  where  L  and  A  now  denote  the  deterministic  approximations  described  at  the 
beginning  of  this  section. 

The  limiting  process  here  is  such  that  St  can  be  regarded  as  arbitrarily  small  compared 
to  h,  so  that  terms  of  order  8 are  negligible  compared  to  those  of  order  h?St.  Therefore, 
since  Equations  9  through  12  are  to  be  interpreted  as  the  formally  corresponding  Ito 
equations,  and  since  the  measurement  noise  o  in  the  interval  <t,t  +  St)  is  independent  of  the 
composite  state  (x,9)  at  time  t, 

E(x-x)  =  0 

and 

E[(z  —  3)  (a  —  =  R(t) 

conditioned  on  the  data  available  at  time  t,  and  the  following  conditional  expected 
increments  can  be  evaluated  from  these  equations  to  a  formal  accuracy  of  order  k*St  for  any 
given  u(t): 

e(^  j  =  Fx+Gu  +2tr|r(9xT  +£rj  +2tr(‘P9urj 

+  tr  | F^60r  +  t)|*  +  tr  |(L'uj(e0T  +  (34) 

E(i?)  =  T“  1351 


E(f)  =  (p  -PHTR~lHy>  +d(ft-HtR~1HP )  +  [fi(n-.Q'R~lHp) 

•  (rT-  phtr  ~  1q’t^et  |  x  +  |(r  ~phtr~1q"^p  +2/?)  +  (p  +2D^rr’ 

-  QT'fl_ltfpj  +  l|  0  +(ev  +'PTEr|tt  +  iptrT|F(9er+L) 

+  tr[F(eer  +  £,j|p  +  i  tr[  L’(00r+L)s’rQ]  ~PHtR~x\kEt 
+  2tr(nA')p  +  j  tr  [h(§0t  +  l)p|  [  -  \eKt  +  2Ptr(A'Qr) 

+  iptr[H(007’+Lj|jp_lHP-2(offr+PQ0j/e  ^AP  +  ^pjoJ  (36) 
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)  =  (f  -PHtR-1h)e  -2 PHTR~lKL  -2 DHTR~1(hE  +KL^J 


rTR~lQ' 


+  2(rLj’i-2  LQ'R~1h(p  +2£>J  x+2  £rr  -ET(^H 
+  VtR-'h)p  ~l(ktR~1  cr  +  A'R_l/fjp|'9  +  2  tr  | ^ F" 

-  PHTR_lH"jL0xr|  -t-2(lPL  j'u  +2trT  ( LX  j'er | 

E(^L)=  j(p+2Z))  hT  +2qq  +  trT^H09Tj  +2  e(kT +QTxj 
+  tr(A'nrj  +  tr^A'Q  j|  |  +  2Dj[/fr  +  2Q0  +  tr^HeS7') 

+  2^e(kT  +  tr(A’Qrj  +  tr(A'flj  T 

E(~£^~)  =2|(P  +21,)[ffr  +2Q®  +  trr(H08rj|  -K-aJ^A:7’  +QTi) 
r(\'QTj  +  tr^A'o)]  jfl-1  K  +Q'i  +  AT)  +  trr(H'^8r)|L 


+  tri 


E(  )  =  2 1  A(Hr  +  Z°®  )  +  [  L(°r  +  H®)P  +  2LQr°  I  1 R  * 1 1  (P 

•f  20)[ffr  +  2Q0  +  trr^H00rj]  +2|fi(/Cr  +flrx)  +  tr^A'Q7^ 

+  tr^A'fl^l  jr 


(37) 


(38) 


(39) 


(  8880  r > 

\  —  1/  1 rT  4»  A  ±  trl  H  P- H  V  j.QfJ  X  A  4- 

A  hi  J 

i  —  4L  a  tw  xta  o*ririti  xu  i  a  a  tui  t  a  o  t  ir  i 

[Hx 6  /I 

(40) 


L 

(41) 


'I 

;  j 
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e(  )  =  2  j  |  A'L-1^  +(a'L~1EtJ'  +(AX",£T'j'|//7'  +(eQtP  +PQETj' 

+  2a(kT  -HQrij[/?_l[x'+Q',i  +  I’e  4- trr(  H'^  j  L  i 

E(  5D^8-  )  =  1 1  A'L ~ lET  +  ( A'L ~  lErJ  +(A'L-lET  j\HT  +(eQTP  +  PQ£rj' 
+  2A^Kr  4-Qrx  j  Jr  " 1  j/f|££,-1A*  +^EL~1A’j  +  (EL_lA*j'| 

+  (eQTP +PQETf +  2^KA  +(QAjx|J  ( 

e( 8E&8E  j  4jA^Hr  +2Q0  j  +  +  Hejp  +  2LQro|' |ft~l 

)^A  +  |l^Q^  +  H8^f 


+  2fl'r8 


Qr  +  H0lP+2LQrD<'r 


(44) 


The  moment  properties  of  x,  0,  P  +2 D,  E,  L  and  A  can  be  used  to  evaluate  terms  not 
involving  these  increments  in  the  right-hand  side  of  Equation  32  to  order  A 2  as 


^t[a  4-  tr^GL  j|x  +  i£Ttr(c00r)i  4-  tr^Aoj 
+  i0rtr(G'pje  +  £0rtr  [d^w7,  +2d)|  +  -  trr($ >P^0 
+  7  tr(tfV)  +  +®'9  +  tr[ftl(00r  +£.)|)«  +  jc7"  +0  TZ 

-*■  tr  r|d^00r4-£.  j|  |u  4-  j  tr^Ap'j  +  trtr^«$A J  . 

The  order  of  accuracy  of  Equations  34  through  45  is  retained  if  L  and  A  are  regarded  as  the 
deterministic  approximations  mentioned  earlier.  Furthermore,  Equation  35  is  actually 
accurate  except  for  terms  of  order  VStwhich  are  negligible  to  order  hr.  This  is  because 
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3(t+StJ  = 


0(t)  +  Tubt  +  a) 


‘~i: 


e(t)u>2dt 


except  for  discretization  errors  of  order  St2.  Since  u  is  a  zero-mean  random  variable  that  is 
independent  of  0U),  E[8(f -t-8t)/yU)|  =  0(t)  +  Tu8t  to  this  accuracy  by  definition  of  0  and 
elementary  properties  of  the  expectation  operator.  So,  since  0«  +  St)  is,  conditioned  on  y(t), 
equivalent  (except  for  errors  of  order  St372)  to  the  prior  expected  value  of  0(H-8t)  conditioned 
onz'las  defined  in  the  previous  section),  which  is  just  E[0(t  +  8t)/y(<)|  by  decomposition  of 
expectation. 

Since  expectation  is  a  linear  operation,  Equations  34  through  45  can  be  substituted  for 
the  random  variables  in  the  right-hand  side  of  Equation  32  to  eliminate  the  conditional 
expectation  there.  Since  Equation  35  is  accurate  to  order  h3,  this  substitution  is  accurate  to 
order  h 2  even  though  Jg  is  of  order  l/h  by  assumption.  It  would  be  needlessly  tedious  to 
display  the  entire  result  of  this  substitution,  but  the  terms  depending  on  u  are  of  the  form 


where 


-uTXu+yT 
2  “ 


X  —  B  +0'0  +  tr  [M(e0r+£,j| 


y  =c+Zr0  +  tr[A^00r+Lj|  +  [g%2V0  +  trTJl^00r +L^|  JjJ 

+  TTjT  +2tr(£W0)  +2tr(f.wj)  +2tr(L 'VJ)l8  . 

Setting  the  u-derivative  of  this  expression  to  zero  gives  the  minimizing  u  as  —X~ *y. 
Expanding  X-1  to  second  order  in  h  and  substituting  from  Equations  21  through  24  for  the 
partials  in  y  gives  this  value  of  u  to  order  A2  as 


u  =  «Q  +  ut  +  u, 


where 


u0=  -B-l[cr(sS  +<)t-c|  . 


u,  =  -B'1  Grq+rr^+  Zr+Grtf +2Vr{-e8‘ 


*(Gr<+c)|( 


+  tr  HG  +  2SV-SGB 


-‘e)’S6r|} 
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u2=  —  B~  l(\r  | (  A  +  2*P'V^L  |  +  tr^LL )<  -  trflVIL  )b~  1(gt<  +cj 
+  [gTW  +  j  trr(  LL  j -tr(  ML  jB~lGT  |  S  +  2  tr(«FITL  jji 
+  (2V'rn  -9B-,Tr<t>  +TTu"jQ  +  2tr  Je^S  +i,j'P'7’ 

+  «{Bl.v)  +  i(0rn)|j+tr[ji  +  I(cr2) 

+•  2VN  -eB~l(zT  +GTN+2'Vt'ZJ  +  |l'< 

+  (m  -e’B_1e’jB-l(G^  +c)|'}eer|  +  tr  [  [2^' 

+  [rG  +  SL*r  —  ^HG  +2S»rjB-l0 

+  SGB‘l(e*B*te,-Mj'J’j00rJj?j  .  (49) 

From  the  orders  of  magnitude  being  assumed  for  the  parameters  here,  u(  is  of  order  hl,  i  = 
0,1,2.  Using  this  value  of  u  to  eliminate  the  minimization  in  Equation  32  replaces  the  in¬ 
dependent  terms  from  the  previous  substitution  (for  the  expected  increments)  by 


—  -^«0 +Uj +u2jr  B +0'0  +  tr  M^00r  +L j  ^uQ  +Uj  +«2^  •  (501 


Substituting  from  Equations  47  through  49  for  u0,  u.  and  u2  here,  substituting  from 
Equations  22  through  31  for  the  partial  derivatives  of  «fin  the  other  terms  of  the  previous 
substitution  in  Equation  32,  using  the  fact  that  Jt  is  just  the  right-hand  side  of  Equation  18 
with  dots  over  the  deterministic  time^fimctions,  retaining  only  terms  up  to  order  h2,  and 
equating  coefficients  of  like  powers  in  x,  0,  D  and  E,  shows  that  Equation  32  with  boundary 
condition  (Equation  33)  would  be  satisfied  to  order  h2  by  the  function  J  of  Equation  18  if  o 
obeyed  a  certain  ordinary  differential  equation  whose  details  aren’t  important  here,  and  if 
the  other  coefficients  obeyed  the  following  equations: 
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W=  w(gB~1GTS  —F j  +(^SGB~lGT—FT^jW  —  S  tr^FL  j 
-  trr(  FLjs  -  tr (gL  j  +  2  tr  [  Q’R  ~  lHP  —  T^II' 

+  n'r(pHrR_1n,r-rT^  -Q'R_1ffPH'L  -lulq’r-1q,t 

—  2  tr^H'LQ'R~lHP^  -2tr(Q'TLULQ'R~lj 
+  SGB_1[trr(LZ-)s  +2tr(v'nxj]  +  [str(LZ.) 

+  2tr(nXV'j|B_lGrS  -SGB-ltr^ML^B-lGrS;  1*^  =  0 

n  =  (  SGB  “  !Gr  -  Pr  jq  +  SGB“1|7’*$  +  tr^AL  +  2  V’Vl)  +  tr^LL^ 
-  tr  +2  tr  j  [  Q'R  '  1Hp(  V-  iv)  +/CrR  “  lffp(  II  -  h|  -  TV 

—  2t/LQ'R-l/f  |l|  +  |  WG  +Str^LL^+2tr[n'L»P') 

—  SGB " ltr ^ MX.  j | B-l^Gr^  +cj  ;  q(^)  =  0 


(51) 


(521 


4>  =  tr[^P^-4>-HrR-1HPM)+2PHrR"lQ'PYr-(s  +^)(s  +2Ppj| 

+  (attg  +z  +2«F<;)B-l[Gr((;+n)  +rr<t>  +c|  +(GT+2«p'njB-^Gr< +c) 

-  j  tr  [B-l©B-l^Grt+c^rG  +2qrG  +2*rT+c)|  ;  *  J  Xx{'ifPf)  (53) 

r=  y(pHtR~1H-f')+(htR~1HP  -Ft)y -SGB~1GtS  ;  )  =  0  (54) 
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V  =  (hTR~1HP -FT  jv  +2YPHTR~lK  +  trr(  KtR~1HPBl) 

—  htr~1(^hpn  +2KLu^  -2rrc  +  |2(s  +yjH»*-t-nrG 

+  2tr'r(BL'P'jjB~l(Gr^  +cj  ;  v(tf  j  =  0 


(55) 


i+tr|V|2(  PHTR-lQ'  -r^P  -2 


U=  —  tr^G'P  j 

+  [2^PHTR-1Q'-r)p-s]v  —  rPHTR  ~ lHP  - 2PHTR  ~  'O’PH 
-  2VIPHTR ~ lQ'P  — 2^S  +y)pp-  ^  j(s  +  y)sQ£r  +  Q£r(s  +yj 2  | 

+  2p(H,R~lH  +  Qr«"lQ')pyj  4-2  tr^Q'PypQ7^/?"1^ 

+  j2A'  +  2iVT»l»,'+2('Priv)'+^L'  +  LT|<;-(iVrG  +Z  +  2'P’<)b_10  +  ^(**+  2r) 
-  eB-I(Gr(V+Zr+2Vr<jj'jB-^Gr<+c^  +^NTG  +Z  +2V’^B-l[GrJvj 
+  Zr+2»Pr<;)  +  tr  {b-1[  ^0B“l0  )'+  ^0B-,0  j”r— M  jB~l^G^ 

+  c)(<tG  +ct)J  ;  u[tf  )  =  tr(A'P^)  (5 


N  =  (sGB"lGT-Fr  )iV-2rr(<+n)  +SGB"l[zT  +  TrG+2'PT^+n)| 

+  ^HG  +2S'P*-SGB_l0 ^B“l|Gr^  +Tr«J>  +c|  ;  Af(t^  =  0  *57) 
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n  =  (sGB_lGr-Frjn  +n (phtr~1h -f'j  -<u-2(sr+rrs) 

-  HPHT  +  2^ULSlT^  |b  _,H  +2  tr  |L^Q'B_lffP  —  I”  jlJ  j 
+  2(pifrR_1Q*-r^Ti,+2SGB-1[‘t'r(s  +yj  +  tr’^BLV j|  ;  n (^)  =  ^  (58) 

v  =  (yHTR~vHP  -ft)v  +v(phtr'xh  -f'j  -<t>  -2(sr+rrsj -HPHTR~lH 
-  HtR ~ lHPH  +2Y^p(^HTR-lQ’+nT' R~lHj  -rj  +2j(nrR_l/f  +HTR~lQ"jp 

-rr|y  +  tr'|2(pHrR"lfl’-rjpR-ER|  ;  v(tf)  =  Yf  (59) 

H  =  (sGB_1GT-Fr)H  +h(gB~1GTS  ~f)  -0  +  2s('P’B~lGrS  ~rj 

+  2(sGB-lVr-rr)s  — SGB~l0B~lGrS  ;  H (tfJ  =  Hf  (60) 


3  =  i(sGfl-lGr-.Pr^2  +Sr)  -2tfVr-2(r'iVy-2F'<;+SGB--1{2[(V 
+  ntvt  +( vrAr)']  +(l'+ l  t)'<  + 1  (eir  *e)* + ( es*  le)r  -2M*|  "b  ~  l(c 
+  Gr{) j  ’  +  [  IG  +  SL  T ■ +  2(vH‘ ' )' - ( 4G  +  2S'P’)b ~  l9 ] B ' l^Gr<  +  c) 
-  SGB-10B_l^Gr2V  +Zr+2'Pr<^  +  [[ro  +SL'T+2(vH,y 


-  (HG  + 2S,P"  ^B  ~  *0 1 B  ~ GTi,  +  c j  -SGB-I0B~l^GrAf  +Zr  +2Vr^)r 
+  |(tfrG  +Z  +  2¥'{)b~1(hG  +2Sv)'  +  (GrH  +2S¥r)B~l(ortf +ZT+2Vr<)|'  ; 

s(*,)  =  0  (61) 
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R  =  (htR~1HP -FT  )R  +  r( PHTR  ~  -Fj  +  J  (htR~'HP  -Ft  )r 
+  R^PHtR-1H-f')  *  +Y®HTR-lH  +htr~'h®y 
+  (y®HtR-1H  +HTR~lH®YjT:  r(^)  =  0 

B  =  (htR~1HP  -Ft)b  +  | (htR~1HP  -FrjB|‘-  U<8>HTR-ln}T 
-  ^(u®HTR-lHjr  :  b(*,)  =  0 


1=  -G-(FTr  +l’F  +¥"S  +  SF*r)’-HT'r-r'H’-( I'M  Jr 

[sGB~‘[(eB"l9j-t-(eB~lejr-2M|  +rc  +SL*r 
+  2(  VM'  j'  j  B •  lGrS  )* +  J  ( [  rc  +  SLmT  +  2(  H>'H'  j*  |  B ~  *GrS  |'r 
+  ^[[re  ^SL*r+2(vM')’|B~lGrs|r+  J  {[rc  +SL’r 


Z^VH’ j"  |i 


HG  +  S<r 


M 


HG  +  2SV' 


+  |(^HG+SH»,yTB_l(HG+2SV'||r:  l(/^  =  A  (64) 

Of  course,  L  and  A  denote  the  prior  expected  value  approximations  throughout  Equations  33 
through  64.  Since  these  approximations  are  deterministic  time  functions  (whose 
computation  is  described  in  the  next  section),  so  are  the  cost-function  coefficients  determined 
by  integration  of  Equations  51  through  64  in  reverse  time.  From  the  form  of  these  equations, 
these  coefficients  also  have  the  symmetry  properties  and  orders  of  magnitude  postulated  for 
them  in  Equation  18.  The  latter  follows  from  the  fact  that  each  of  them  except  U ,  $,  and  o  is 
multiplied  in  the  differential  equation  generating  it  by  a  coefficient  matrix  which,  from  the 
assumed  observability  and  controllability  properties  of  the  corresponding  LQG  problem, 
make  that  equation  stable  in  reverse  time  with  fast  enough  damping  that  the  effects  of  the 
driving  terms  can  only  accumulate  over  time  intervals  of  order  unity.  Hence, 
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the  orders  of  magnitude  of  these  quantities  are  the  same  as^those  of  the  driving  terms  in  the 
corresponding  differential  equations.  The  equations  for  U,  cj>  (and  also  the  one  not  shown  for 
6)  have  no  such  damping,  so  their  orders  of  magnitude  are  in  general  l/h.2  times  those  of 
their  driving  terms  for  integration  intervals  (i.e.t  t,  —  tQ)  of  this  duration.  From  inspection  of 
the  driving  terms  in  Equations  51  through  64,  ana  also  those  in  the  corresponding  equation 
for  o,  the  previously  assumed  order  of  magnitude  is  obtained  in  each  case. 

As  a  result,  the  function  J  of  Equation  1 8  satisfies  to  order  h 2  the  Bellman  equation. 
Equation  32,  and  the  boundary  condition.  Equation  33,  for  the  optimal  conditional  expected 
cost-to-go  function  when  the  coefficients  are  determined  by  Equations  19  and  20,  51  through 
64,  and  the  similar  ordinary  differential  equation  not  shown  for  alt).  As  a  further 
consequence  of  the  Principle  of  Optimality  of  dynamic  programming  which  leads  to  the 
Bellman  equation,  the  corresponding  control  law  specified  by  Equations  46  through  49 
approximates  the  optimal  control  law  asymptotically  to  order  h2.  These  conclusions  are  only 
of  a  formal  nature,  however,  and  were  reached  with  the  tacit  assumption  that  the  context 
here  is  sufficiently  regular  and  well-posed  that  (1)  the  higher-order  (than  h 2)  errors 
introduced  by  the  approximations  of  the  Bellman  equation  made  here  only  change  the 
solutions,  and  the  corresponding  optimal  control,  to  higher  order  in  some  appropriate  sense; 
(2)  for  h-* 0  the  optimal  value  function  is  a  neighboring  function,  again  in  some  appropriate 
sense,  of  that  of  the  corresponding  LQG  problem  (which  is  well  known);  and  (3)  there  is  only 
one  neighboring  solution  of  the  (exact)  Bellman  equation  for  the  problem  at  hand  for  h-*0 
(i.e.,  the  neighboring  solution  obtained  here). 

Actually,  the  derivation  of  this  section  would  remain  formally  valid  for  any  specific 
application  as  long  as  the  quantities  deleted  as  being  of  higher  order  than  h 2  remain  small 
compared  to  those  retained,  and,  of  course,  the  state  estimation  results  are  valid  as  described 
at  the  end  of  Section  IV.  These  deletions  are  too  numerous  to  list  all  the  resulting  conditions 
for  validity  explicitly.  In  particular,  however,  when  Equation  50,  via  Equations  46  through 
48  for  «0,  u,  and  u^,  was  substituted  for  the  u-dependent  terms  arising  from  the  Bellman 
equation,  the  terras  u^Bu^  and  u^Bu^  were  deleted  whereas  uQTBu0  was  retained.  Since  B 
is  positive-definite,  this  means  that  these  conditions  must  certainly  include  the  requirement 
that  the  prior  expected  values  of  UjTu^  and  remain  small  compared  to  that  of  Uq tu0, 
which  is  essentially  a  condition  on  the  perturbation  terms  in  the  asymptotically  optimal 
control  law  being  small  compared  to  the  dominant  terms. 


VI.  OPTIMAL  CONTROL  LAW  APPROXIMATIONS 

In  order  to  determine  (to  order  A2)  the  control  law  which  satisfies  the  optimality 
conditions,  it  still  remains  to  find  the  prior  expected  values  of  L  and  A  (as  given  by 
Equations  13  and  14)  when  this  control  law  is  used.  Since  the  boundary  conditions  of  these 
equations  are  specified  at  fQ  and  those  of  Equations  51  through  64  are  specified  at  t^,  solving 
these  equations  might  seem  to  involve  the  solution  of  a  two-point  boundary  value  problem. 
Because  of  the  structure  of  these  equation  systems  and  the  orders  of  magnitude  of  the 
variables  involved,  however,  it  is  possible  to  evaluate  these  prior  expected  values  to  the 
desired  degree  of  asymptotic  accuracy  by  solving  only  initial  value  systems  of  (ordinary) 
differential  equations.  To  do  this,  iirst  define  the  following  prior  expected  values  under  the 
optimal  control  law; 
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x  =  E(x)  =  E(x) 

(65) 

E  =  ElE) 

(66) 

-Ixr 

(67) 

e|(e-e)(£-x  )t|  =  E  (e£tJ 

(68) 

=  e[(b-I  j|  =  E 

-i®I  . 

(69) 

These  are  all  deterministic  time  functions  with  the  time  argument  t  suppressed  in  the 
notation. 

By  Equations  46  and  47,  the  optimal  control  law  gives  the  control  as 

«  =  aQ= -B-1|cr(si+<j+cj  (70) 

to  order  unity,  so  from  Equation  9  for  a  generic  time  t  and  the  fact  noted  earlier  that  E(z-z) 
=  0  there  conditioned  on  the  data  up  to  time  t, 

E/<iato<t,Kf+8/)|  =^  +  [(^'GB-1Grs)i-GB-l(GrC+c)|5/  <71> 

to  order  St  for  a  very  short  time  increment  St,  where  the  quantities  on  the  right-hand  side  are 
all  evaluated  at  time  t.  Taking  prior  expectations  of  both  sides  of  Equation  71  and  using 
definitions  65  through  69  gives,  in  the  continuous- time  limit, 

l  =  (f -GB~1GTs)x  -GB~1(gt<  +c)  ;  x(t0J  =  £0  (72) 

to  order  unity.  Similarly, 

£  =  (f -PHTR~XH^JE  -2PHtR~1KL  +  2[(Lr-£,a'R-!HJ»)' 

-  (sGB*lVZ,j'j  x  -2(vL^’B-l^GTi+cj  ;  sfo)  =  0  (73) 

to  order  h,  where,  for  this  accuracy,  L  can  be  regarded  as  the  prior  expected  value 
approximation  being  sought.  Since,  as  noted  earlier. 
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e[('-jX*-jH=s(‘) 

in  Equations  9  through  12  in  this  context,  it  also  follows  from  Equations  9  and  TO  that 

■wK'  +8'M‘+s‘)I  - (((f  -ob-'g’-s); 

-GB-1(gt{+c)]xt+£[  iT(  Ft  -  SGB  -  xGt  j 

-  (<rG  +cTjfl“  *Gr]  +  PHtR  ~  lHP  6 1  (74) 

to  order  unity.  Taking  prior  expectations,  using  definitions  65  through  69,  taking  the 
continuous-time  limit  and  subtracting 

7,(yzT) 

using  Equation  72  gives 

X  =  (f  -GB~*GrS  jx  +x(ft-SGB~1GT^J  +PHTR~lHP;  x(*0)  =  0  (75) 

to  order  unity.  From  analogous  derivations, 

♦  =  (f  -PHTR~lH^i  +^Fr-SGB-lGT)  +  2[(l.r-I.Q’R-lffp)' 

-(sCfl-lW.y|x  +2^/<lHT  +  [lQTP^\r-1HP  ;  $(f0)  =  0  (76) 

to  order  k,  and 

E  =  (V  -PHtR~1R) E  +E (ft-HtR-1HP^  +2[^Lr-£,Q’B“1HPy 
-  (sGB_Vi,y|*r+2<*|(rrL  -PHtR~1Q'tlJ -(^LVTB~lGTS^i  | 

4[AHT+(LQrp)'jR_l[ffA  +(«»^)*]  :  e(*0)  =  0  (77) 
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to  order  A2,  where  L  again  denotes  the  prior  expected  value  approximation  and  A  now 
denotes  the  approximation  (accurate  to  order  A)  obtained  from  Equation  14  with  this 
approximation  for  L  and  with  D  and  9  replaced  by  zero. 

Taking  prior  expectations  in  Equation  13  and  using  definitions  65  through  69  then 
shows  that  the  prior  expected  value  approximation  L  satisfies  the  deterministic  differential 
equation 


L  =  z2Q2~{  ETHT  +  2LKT^R~1(h E  +  2/U.)  -  tr  ^  HTR  ~  lHE  j 
-  2Ltr|QrR~ltf(  Exr  +  <Sj|  -  2tr  ( ExT  +  ^flrR~lH  L 


-4 \L 


„{ktr 


-1Q'r  +  Q'R 


\L\  x  —  4L  tr 


QrV 


(78) 


to  order  A2  Over  control  intervals  t^—  tQ  of  order  1/A2,  therefore,  this  equation  generates  (the 
prior  expected  of)  L  to  within  an  error  which  is  formally  of  order  A  with  virtual  certainty. 
This  is  the  same  order  of  accuracy  with  which  this  prior  expectation  approximates  the 
sample  values  of  L  generated  by  integrating  equation  system  5  through  16. 


Consequently,  an  approximation  of  the  optimal  control  law  which  has  the  formal 
appearance  of  being  asymptotically  accurate  to  second  order  in  A  (as  A  goes  to  zero)  can  be 
computed  and  implemented  as  follows: 


1.  Integrate  Equation  8  for  Pit )  in  forward  time  from  tQ  to  t^,  and  Equations  19  and  20 
for  Sit)  and  £(t)  in  reverse  time  from  f^to  /Q. 

2.  Using  the  results  of  Step  l,  integrate  Equations  72, 73,  and  75  through  78  for 


x  W,  E  it),  X(t),  <**<),  E(t),  and  Lit) 

in  forward  time  from  (Q  to  ^.together  with  the  approximate  equation 

A  =  (f-PHtR~1hJa  +  A (ft-HtR~1HP^  +  (prL^'-h  (LrTpj 

-  PHTR  ~ ^PQL^j -  (lQP^'R  ~ lHP  +LL"  ;  A^)  =  0  (79) 

for  A (t). 

3.  Using  the  results  of  Steps  1  and  2,  integrate  Equations  51  through  64  for  the  value 
function  coefficients  in  reverse  time  from  to  (Q. 

4.  Implement  the  control  law  by  using  these  coefficients  and  L  from  Step  2  in  Equations 
46  through  49  for  the  control  value  u,  while  integrating  equation  system  9  through 
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A 

16  in  real  time  to  generate  x,  0,  D,  and  E  (which  are  also  needed  to  evaluate 
Equations  46  through  49)  from  the  incoming  measurements  z. 

Actually,  the  same  order  of  asymptotic  accuracy  canAbe  obtained  by  using  the  result  of 
Equations  78  and  79  for  L  and  A  in  the  generation  of  x,  8,  D  and  E  in  Step  4.  In  this  case,  only 
Equations  9  through  12  need  be  integrated  in  real  time  to  implement  the  control  law, 
together,  of  course,  with  the  real-time  evaluation  of  Equations  46  through  49  for  u.  However, 
the  results  here  are  of  greatest  significance  when  A  is  as  large  as  meaningfully  possible,  in 
which  case  these  two  alternative  implementations  may  have  markedly  different  behaviors. 
The  second  one  might  be  less  responsive  since  it  uses  prior  expectations  of  L  and  A  in  place  of 
their  actual  sample  values.  On  the  other  hand,  it  might  be  more  robust,  in  the  sense  of 
working  better  for  larger  values  of  A,  since  L  ~ 1  is  used  in  generating  D  or  M  via  Equation  1 1 
or  17. 

It  is  possible,  by  retaining  a  higher  order  of  asymptotic  accuracy  in  an  analysis  similar  to 
that  leading  to  Equations  72  through  78,  to  evaluate  the  average  sizes  and  effects  of  the 
higher-order  terms  in  the  optimal  control  law.  The  analogous  equations  become 
considerably  more  complicated,  however,  and  such  an  analysis  is  beyond  the  scope  of  this 
report. 


VII.  INCLUSION  OF  PRECISELY  KNOWN  STATE  VECTOR  COMPONENTS 


In  some  applications  of  interest,  the  motion-state  vector  x  can  be  partitioned  as 


such  that  the  dynamics  of  Equation  1  reduce  to  the  partition  equations 

*i  =  fVcl+*Vt2+Gi“  +2tr(ri0x|,j  +2tr(r20x2j 

+  2tr('P10ur)  +  tr^FjOO^Xj-b  tr^F200r\t2  +  (l  +  £  ,0  )  u>x  <80> 

and  /  \  / 

*2=F3X2  +  03“+“'3-  (81) 

the  measurement  equation  is 

z  =  fflxI  +  H2x2+2K0  +2tr(ni0xf)  -^tr^Oxj) 

+  tr(H100r)x1+  tr(H20eT)x2+  tr( A00r)  +v  ,  (82) 
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and  where  the  controller  knows  the  current  value  of  x2  exactly.  The  process  noise  partitions 
u>t  and  u>3  are  uncorrelated,  the  covariance- matrix  parameter  R  of  the  measurement  noise  v 
is  positive-definite  as  before,  and  the  order-of- magnitude  restrictions  on  the  parameters  is 
the  same  as  before. 

Since  x,  is  known  precisely,  the  joint  conditional  probability  distribution  of  x  and  9  is 
singular,  and  the  analysis  of  Section  IV  does  not  apply  directly  to  this  case.  The  same 
procedure  can  be  applied  to  Equations  SO,  2,  and  82,  however,  to  derive  similar  recursive 
equations  for  generating  the  same  type  of  second-order  Edgeworth  approximation  to 
pUpB/y)  that  is  asymptotically  accurate  to  order  h2.  Since  the  current  value  of  x2  is  available 
as  well  as  that  of  u,  the  transition  of  the  reduced  composite  state  (xt,0)  from  /  to  t  +  8t  can  be 
approximated  by  a  sequence  of  four  transformations  of  the  same  form  as  those  described  for 
(x,0)  in  Section  IV.  Furthermore,  Equation  82  is  equivalent  to 


=  H  jXj  +  2 


(<r+0*) 


10  +2  tr 


tr(H100r)x1-t- tr  (a  -l-H^jee7 


+  v 


Since  x2  can  be  treated  as  a  known  parameter  here,  this  is  a  measurement  equation  for  x. 
and  8  that  has  the  same  form  as  Equation  3  for  x  and  8,  with  (x  —  H2x2)  playing  the  role  of 
the  measurement  variable.  Thus  the  derivations  of  Section  IV  can  be  applied  in  this  context 
with  no  change  other  than  a  redefinition  of  variables. 


Not  surprisingly,  however,  the  result  reduces  to  what  Equations  5  through  17  would  be 
if  constructed  for  the  entire  composite  state  vector 


except  for  the  following  modifications: 

1.  The  differential  equations  are  not  integrated  for  components  of  the  conditional 
moment  parameters  involving  components  of  x2- 

2.  In  the  other  differential  equations,  all  occurrences  of  conditional  mean  components 
of  x2  are  replaced  by  the  corresponding  (known)  component  of  x2  (i.e.,  x2  is  replaced 
by  x2);  all  occurrences  of  the  higher  conditional  (central)  moment  components 
referring  to  components  of  x2  are  replaced  by  zero. 

As  a  result,  the  derivations  of  Sections  V  and  VI  for  the  asymptotic  approximation  of  the 
optimal  control  law  all  go  through  for  this  case  with  these  modifications. 
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VIII.  EXAMPLE 


The  use  of  this  theory  can  be  illustrated  by  applying  it  to  an  idealization  of  a  planar 
intercept  problem  in  which  a  constant-speed  interceptor  can  accelerate  perpendicular  to  its 
flight  path  and  receives  noisy  angular  measurements  of  the  line  of  sight  to  a  maneuvering 
target.  Although  this  example  is  meant  to  resemble  the  guidance  of  a  homing  missile,  this 
resemblance  is  so  highly  simplified  and  distorted  that  the  results  obtained  for  it  should  not 
be  taken  as  valid  for  guidance  law  design  in  any  actual  application  of  this  sort.  Rather,  these 
results  are  meant  to  indicate  the  qualitative  type  of  information  that  should  be  available 
from  a  reasonably  realistic  analysis  of  such  a  situation  and  the  basic  form  this  analysis 
might  take,  without  having  to  deal  with  the  extra  complexity  that  would  be  required.  In 
summary,  the  result  of  these  simplifications  is  the  following  problem,  in  which  all  the 
variables  are  scalars. 


Dynamics: 


Measurements: 


x  =  —  t+sj^ucosy  +  u>cJ 


0  =  “-(usiny  +w.J 


Criterion: 


+  n 


J  = 


-E 

ax2  + 

[Vdt 

2 

f 

I'o 

a  >  0 


In  this  formulation,  we,  t»t,  and  n  are  jointly  a  zero-mean  Gaussian  white  noise  process  with 
covariance  matrix  parameter 


and 


q  0  0 
0  q  0 
0  0  R 


R 

<7 


2  Vqritf-t)2 
AS2 
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V 


Y 

r 


a 

A 

0U) 


closing  speed  (assumed  large) 
time  of  intercept 

angle  of  interceptor’s  velocity  from 
its  line  of  sight  to  the  target 


Parameters  of  nominal  intercept, 
in  which  both  vehicles  move  in 
straight  lines  at  constant  speed 
and  y  is  constant. 


variance  parameter  of  angle  measurements  being  approximated  (assumed  to 
have  the  characteristics  of  radar  glint  noise) 


r.m.s.  target  ( linear)  acceleration  —  assumed  isotropic 
average  time  between  target  acceleration  changes 

T(t)  —  tf  \  TU)  =  "projected"  time  of  closest  approach  (i.e.,  if  neither  vehicle 
accelerated  after  time  t) 


xU)  =  projected  distance  of  closest  approach  at  timet 

Although  this  problem  might  not  appear  to  have  exactly  the  form  of  those  problems 
treated  above,  this  problem  essentially  does  have  the  same  form  because  V  is  typically  large 
(compared  to  velocity  changes  resulting  from  interceptor  and  target  maneuvers)  so  that  a 
new  parameter  variable  9*  =  VV9  could  be  used  in  place  of  9,  in  which  case  the  dynamics 
and  measurement  equations  would  all  have  the  proper  form,  except  for  very  small  values  of 
—t.  Then,  however,  the  form  is  unimportant  because  the  interceptor  is  so  close  to  the 
target  that  it  no  longer  has  much  control  over  the  outcome.  The  use  of  9  rather  than  9*  here 
basically  has  the  effect  of  making  9/i  itself  a  quantity  of  order  h,  rather  than  some  of  the 
other  parameters.  Although  the  noise  in  the  (linear)  measurements  of  x  here  has  been 
defined  to  approximate  the  effects  of  radar  glint  noise  in  the  angular  measurements  of  a 
homing  missile,  the  interceptor  cannot  measure  current  target  range  in  this  example,  which 
might  correspond  to  the  radar  being  jammed.  The  choice  of  performance  criterion  (for  any 
reasonable  value  of  a)  is  such  that  if  this  range  were  precisely  known,  the  optimal  control 
law  would  reduce  to  the  standard  proportional  navigation  taw  (with  an  effective  navigation 
gain  of  3)  operating  on  a  conditional- mean  estimate  of  the  line-of-sight  rotation  rate,  except 
very  near  the  beginning  and  end  of  the  encounter.  In  the  idealized  context  of  this  example, 
therefore,  it  is  reasonable  to  view  the  midcourse  portion  of  the  optimal  control  law  as  the 
way  that  proportional  navigation  guidance  should  be  modified  for  the  absence  of  homing 
range  data  when  the  line-of-sight  measurements  (or  their  equivalent)  are  noise-corrupted. 

For  simplicity,  only  the  case  of  y  =  x(tQ)  =  0  is  considered  _m  further  detail.  In  the 
notation  of  the  preceding  sections,  the  variables  F,  T,  F,  L,  T,  K,  H,  A,  A,  H^.,  Z,  A,  c,  G,  M,  <&, 
and  6  are  all  zero  in  this  case,  and 


H  =  B  =  1, 

G  =  tf~t, 

Q  =  e2V2Q2  =  q(trt)2, 
R  =  2  Uf-t)2y/qr, 
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V  =  1/2 , 

—  1 
£  =  - ,  and 

v-* 

Q  =  -  - - - . 

2Uf-t) 

As  a  result,  the  cost-to-go  function  coefficients  q,  4,  V,  AC  and  2  are  zero,  and  $  is  not  used.  All 
the  variables  in  this  case  are  scalars,  of  course,  and  the  equations  which  must  be  solved  in 
real  time  to  implement  the  (asymptotically)  optimal  control  law,  called  a  guidance  law  in 
this  application,  become  the  following  when  1  is  used  to  denote  —  t: 

z  =  S-(x0+EV t  (83) 


i  =  (x  +e)u  +  (p  +2Z>)(l  — 0/t )  +  2A  j/x  ( 


L.) 

2  xV^' 


i.[*-(te+SL>,K^) 

D  =  Eu  +  -~z.|  ^(ep  +x£  +2A  j/x  -^1  -0/x)d  -(d  -  j0P/x)2j 
E  =  Lu  +(  -£~.)JpE0/x  -(  ^P  +£>)(e  -xLIx)  +  [a(i  -0/xj 

-(ip+o)t/.|(.-j)) 


(84) 


(85) 


(86) 


(87) 


u=  -[(s  -»-H0j(t+0)+(ir  +  I02)x+nL]i?-(s  +r+nx  +  BL)fi  .  (88) 

Equations  84  through  87  are  integrated  (with  substitution  from  Equation  83  for  z)  from  time 
tQ  with  zero  initial  conditions,  and  the  remaining  variables  would  be  obtained  from  the 
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stored  solution  of  a  determinisitic  ordinary  differential  equation  system  for  P,  L,  X,  4>,  E, 
S,  W,  Y,  u,  n,  V,  H,  R,  B,  and  I,  as  described  in  Section  VI. 


MIDCOURSE  GUIDANCE  APPROXIMATION 


This  last  equation  system  is  not  shown  in  detail  because  the  optimal  guidance  law  in  this 
particular  case  is  quite  well  approximated,  except  very  near  the  initial  and  final  times,  by 
one  which  can  be  determined  much  more  simply.  A  common  practice  in  such  cases.is  to  use 
this  midcourse  approximation  for  the  entire  engagement.  The  equation  for  S  can  be 
integrated  analytically  to  give 

S  =  1/^1/a  +ta/3  j,  (89) 

which  reduces  approximately  to 

S  =  3/t3  <901 

during  this  midcourse  phase  and,  after  considerable  manipulation  and  use  of  the  fact  that 
0<  <  t,  leads  to  further  approximations 


^Ux/v2-3/(i+e)2| 

A 

X  , 

(91) 

P  =  y/2q  Vqrx2  , 

(92) 

\  =  2qV/3rcr/V  , 

(93) 

=  V6V2q  (r/qW/V 

(94) 

then.  These  approximations  allow  the  optimal  control  law  to  be  specified  by  Equations  83 
through  87  and  91  through  94  except  for  determining  U  in  Equation  91.  The  equation  for  U, 
converted  to  reverse  time  t,  approaches 


(95) 


during  the  midcourse  phase,  which  implies  that  the  midcourse  approximation  for  U(t)  is 
simply  a  constant.  The  value  of  this  constant  (with  respect  to  i  or  x)  is  a  function  of  the 
parameters  specifying  the  problem,  however,  and  must  be  found  by  numerical  integration, 
which  reduces  to  integrating 


P.qx2- 


2xV^ 


;H‘.) 


from  Equation  92 
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in  forward  time  to  from  some  well  within  the  midcourse  region  and  then,  using  the 
stored  result  and  Equation  89  for  S,  integrating  the  following  system  of  equations  in  reverse 
time  from  t^-to 


PY 


2  2 


Y  =  - - S\ 


:  v(«,)-0 


H  =  2St[s  =  0 

i  =  2 +Itj  +fs  +  Hx)2  ;  1^  j  =  0 

"W*0 

7= =  n 


b.-z=! 

2\.*>/qr 


(v  +  2Ht-2V* -It2) -q(s  +  y +2Vt)  ;  =  0 


It  is  clear  from  this  description  that  the  desired  value  U(t{)  depends  only  on  the  parameters 
a,  q  and  r,  and  the  constraints  of  dimensional  analysis  imply  that  this  dependency  (if 
reasonably  well-behaved)  can  be  expressed  in  the  form 


It  is  a  simple  matter  to  determine  /’empirically,  and  the  result  is  shown  in  Figure  1. 
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FIGURE  1.  Empirical  Determination  of  f. 
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As  a  final  practical  point  of  implementation,  this  midcourse  guidance  law  approximation 
would  probably  be  more  robust  if  aH  occurrences  of  x  in  Equations  83  through  87  and  91 
through  94,  except  within  the  "(x  +•  0)”  factors  of  Equations  84  and  91,  were  replaced  by  t  + 
9,  since  9  <  <  x  and  since  x  +  0  is  the  current  estimate  of  the  time-to-go  for  closest  approach 
whereas  x  is  only  the  prior  estimate.  It  is  also  interesting  to  note  that  the  line-of-sight 
rotation  rate  o>  is  given  by 


x 

<0=  - - 

V(x  +  0)2 

A 

for  small  o>  (and  the  purpose  of  the  guidance  is  to  keep  it  small).  Since  0  <  <  x  and  0  <  <  x,  this 
means  that  Equation  91  is  approximately  equivalent  to 


-U(x  +  9>7V  -  3V 
2 


(96) 


where  co  is  the  conditional-mean  estimate  of  u».  This  is  a  form  of  proportional  navigation 
where  the  navigation  gain  is  modified  by  a  range-dependent  term  if  U  *  0. 


LIMITS  OF  VALIDITY 


7 

The  midcourse  guidance  law  approximation  clearly  breaks  down  unless  0  <  <  x,  L  <  <  xz 
(or  0<  <x),  and  the  supposedly  dominant  terms  in  Equation  88  for  the  control  are  large 
compared  to  the  perturbation  terms,  and  these^appear  to  be  the  only  restrictions.  Since  the 
decomposition  of  the  coefficient  matrix  for  the  xXj  terms  of  the  cost-to-go  function  Equation 
18  into  S  +  W  was  made  arbitrarily  for  convenience  in  the  analysis  of  Section  V,  the 
dominant  terms  in  Equation  88  should  really  be  taken  as  (S  +  W)ix.  These  happen  to  be  thg 
only  nonzero  terms  remaining  in  the  midcourse  approximation  of  Equation  91,  however,  if  0 
is  ignored  as  relatively  small,  so  the  last  restriction  mentioned  on  validity  is  basically  that 
the  coefficient  of  x  in  Equation  91  remain  nonzero.  From  Figure  1,  U  is  generally  positive,  so 
this  conditio^  will  be  violated  when  x  becomes  larger  than  some  critical  value  x*. 
Disregarding  0/x  as  negligible  in  Equation  91  shows  that 

*-(£)" 

This  equation  suggests  that  for  engagements  such  that 

/2V2\i/3 

trto>\  u  ) 

the  "perturbations”  in  this  analysis  aren’t  relatively  small  any  more  and  the  optimal  control 
assumes  a  drastically  different  form.  The  navigation  gain  of  Equation  96  becomes  negative 
when  this  condition  is  reached,  meaning  that  the  interceptor  would  deliberately  steer  away 
from  the  target  (in  relative  motion  space)  instead  of  homing  on  it.  This  may  indicate  that  it 
becomes  to  the  interceptor’s  advantage  under  these  conditions  to  approach  the  target  on  a 
highly  curved  course  to  create  a  kind  of  synthetic  parallax  on  which  to  base  estimates  of 
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current  target  range.  This  tactic  has  been  proposed  before  (Reference  5)  and  has  been  shown 
to  enhance  an  interceptor’s  ability  to  estimate  range  from  angle  data,  although  the  utility  of 
this  enhancement  was  not  analyzed.  The  results  here  indicate  that  it  can  be  advantageous 
even  when  its  only  purpose  is  to  enable  better  estimates  of  x  (or  equivalently  the  line-of- 
sight  rate  o>)  to  be  made.  This  is  because  U  -*0  when  the  measurement  noise  intensity  r 
approaches  zero  (see  Figure  l),  and  hence  Equation  96  reduces  to  standard  proportional 
navigation  since  w  can  be  estimated  exactly  under  these  conditions. 

For  parameter  values  typical  of  a  radar-guided  homing  missile,  however,  values  of  i* 
can  occur  which  are  within  its  engagement  envelope,  within  the  region  of  validity  of  the 
midcourse  guidance  approximation,  and  in  the  range  where  glint  is  normally  the 
predominant  source  of  angle-measurement  noise.  Thus  there  is  good  reason  to  hope  that  this 
sort  of  analysis  could  be  applied  to  a  reasonably  realistic  problem  formulation  to  provide 
some  meaningful  information  about  the  guidance  of  such  missiles,  at  least  to  the  extent  of 
indicating  conditions  under  which  radical  departures  from  "proportional  navigation” 
guidance  might  be  advantageous. 
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Appendix  A. 

INDUCTION  STEP  COMPUTATIONS  FOR 
SCALAR  STATE  ESTIMATION 


For  the  case  of  scalar  x  and  8,  the  second-order  Edgeworth  form  assumed  by  their  joint 
conditional  density  (to  order  h 2)  reduces  to 


p( x,8) 


2n  VML-E2 


1  + 


(-H 

(8  — 8)| 

(x  — x)2— Afl  -27(1-*) 

(8  -8)2-L  -2(x-£)e) 

1  L 

1  / 

+  (-L 

\x-x) 

(0—  0)2— A 

(x  — x)2— Af  I  f(8  — 8)2— l| 

'  ML 2 

/ 

1 

II  1 

+  -(—  -0)2[(x  -x)4-6Af(x  -£)2+3M2| )  (A-l) 

2\M2L/  II 


at  a  generic  time  t,  where  the  conditioning  on  the  measurement  history  available  at  that 
timers  suppressed  in  the  notation,  and  where  E  and  \  are  of  order  h,  i,  and  p  are  of  order  h2, 
and  6,  M,  L,  and  x  are  of  order  unity.  (,  and  p  here  take  the  place  of  V  and  N  in  the  notation  for 
the  multivariate  case.  Under  the  hypothesis  that  x  and  8  have  the  joint  density  (Equation  A- 
1)  to  order  h2,  the  joint  density  is  computed  to  order  A28f  for  the  result  of  applying  each  of  the 
four  transformations  described  in  Section  IV  to  x  and  8,  and  for  conditioning  x  and  8  on  the 
measurements  (also  restricted  to  the  scalar  case)  in  a  time  interval  (t,t  +  8 f).  The  result  in 
each  case  will  be  another  density  of  the  form  of  Equation  A-l,  so  the  result  for  the  entire 
induction  step  will  be  that  the  joint  density  remains  of  this  form,  with  parameter  changes 
that  can  be  computed  to  order  h2St  as  the  composition  of  the  changes  resulting  from  each  of 
these  individual  transformation  and  conditioning  operations.  The  effects  of  any  higher-order 
(than  h2)  error  in  the  density  approximation  Equation  A-l  are  not  considered  in  these 
computations,  and  are  tacitly  assumed  to  be  of  higher  order  in  h  than  h2it  for  the  sort  of 
reasons  described  in  Reference  6.  Also,  the  time  increment  8t  is  denoted  by  A  for  convenience 
in  these  computations. 


A  A 

Without  loss  of  generality  at  this  point,  only  the  cqpe  of  x  =  8  =  0  is  treated  in  these 
induction  step  computations.  In  the  case  of  nonzero  x  or  8  jjt  any  particular  point  in  time,  the 
dynamics  and  measurement  equations  for  (x  -  x)  and  (8—8)  reduce  to  the  same  form  as  tjiose 
for  x  and  8.  Hence  the  same  induction  step  computations  can  be  applied  to  p(x-x,8-8)  at 
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that  time  instant  with  a  redefinition  of  the  parameters.  The  desired  result  for^the  increment 
in  p(x,9)  can  then  be  obtained  by  merely  shifting  the  mean  of  the  density  by  (x,8) 


TRANSFORMATION  I 


y 

_ 

l  +  FA 

|  2*FuA 

x  1 

GuA 

n 

0 

!  i 

8  1 

TuA 

(A-2) 


where  a  is  a  known  quantity  (theAcurrent  control  chosen  by  the  controller),  and  let  p(x,8)  be 
given  by  Equation  A-l  (with  x  =  8  =  0).  Inverting  this  transformation  gives 


i  ;  2‘Pua" 

X 

1  *4-  FA  i  1  +  FA 

y  —GuA 

0 

!~rj 

q  —  TuA 

(A-3) 


Since  this  inverse  is  single-valued,  the  standard  result  for  transforming  probability 
densities  gives 


P  [e(o4i)4<  0,3)1 
Prxi(a,P)  “  If  i  +  FA  j  2*VuA 

rt'TT' 


(A-4) 


where  (c,d)  is  obtained  by  applying  the  inverse  transformation  A-3  to  (a,p).  Thus, 


p,^  =  T7fZp‘» 


a  -CuA  +2«FuA(P  -  TuA) 
1  +FA 


,P  -TuA 


(A-5) 


Straightforward  substitution  of  Equation  A-l,  which  is  really  (or  pI0(x,8),  for  p  Q  (with 
different  arguments)  in  Equation  A-5  shows  that  p(y,t\)  isAgiven  by'  Equation  A-l  with 
arguments  y  and  q,  and  with  different  parameter  values  x*,  8*,  M*,  and  E*  in  place  of  their 
unstarred  counterparts.  Except  for  terms  of  order  A2,  these  parameter  changes  are: 


2*  =  Gu  A 
9*  =  TuA 

M*  =  M  +  (2 FM  +  4*P£u)A 
E *  =  E  +  (FE  +  2VLu)A 


(A-6) 
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Dividing  the  polynomial  factor  of  Equation  A-9  by  (1  +  2T9A  +  F82A)  and  using  the 
definitions  of  a,  b  and  c  gives 


+ 


+  £(ri;+L“+2rfjM)|<>2-*' 


(A-10) 


to  order  h2 A.  Expanding  the  polynomial  factor  of 


y  -  L82uA 

i  +2  re  a  +F82a 


e) 


to  order  h2 A  gives  the  remaining  factor  in  Equation  A-7  to  this  accuracy  as 

+ ( ^)!e2-‘  I  ♦( -;/-vl  I62-1- 


■H 

82-l| 

•[e*-L 

+  LvJ 

(A-ll) 


To  order  A2  A,  therefore,  the  joint  density  p(y,0)  is  the  product  of  the  exponential  factor  in 
Equation  A-9  and  a  polynomial  which  is  itself  the  product  of  the  polynomials  A-10  and  A-ll. 
Multiplying  these  two  polynomials  gives  the  polynomial  factor  of  p(y,8)  to  order  A2 A  as 
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Comparing  this  with  the  exponential  factor  of  A-9  shows  that  their  product  p(y,9)  is  the  same 
as  px  0(y,8)  to  order  h2A  except  that  i*,  E*,  M*.  A*.  (,*  and  p*  replace  the  corresponding 
unstarred  parameters,  where 


x*  =  (2r E  +  LLu)A 
E*  =  E 

M*  =  M  +  (8r\  +  2F  LM)A 
A*  =  .\  +  2TMLA 
?  =  i,  +  L2LuA 
p*  =  p  +  (4m  +  FML2)A 


(A-12) 


to  this  order  of  accuracy. 


TRANSFORMATION  3 

Let  y  =  x  +  o>V37with  p(x,9)  given  by  Equation  A-l  and  with  w  conditioned  on  x  and  0 
having  a  zero-mean  Normal  distribution  with  variance  (1  +  oQ)2Q,  where  o  is  of  order  h. 
Then,  as  noted  in  Section  IV, 

p(y,0)  =  |  pxfi(y-wVA'fi)puji9(wjc,Q)dw  (A-13) 

The  integrand  in  Equation  A-13  is  the  product  of  two  exponential  factors  and  a  polynomial 
in  y,  w,  and  0.  The  product  of  the  exponential  factors  alone  can  be  written  as 


49 


NWC  TP  6530 


Va 


y-®V4 | 

9  i 

\  2nV ML -E2 

2Q(1  +oe»“A 


i  V2nQ(l 


+O0TA 


by  multiplying  and  dividing  by  VaT  Completing  the  square  in  the  exponent  shows  that  this 
product  is  also  equal  to 


except  for  negligible  terms  of  order  A3/2.  If  this  expression  is  substituted  for  the  two 
exponential  factors  in  the  integrand  of  Equation  A-13,  the  first  factor  of  Equation  A-14  can 
be  brought  outside  the  integration,  which  then  becomes 


i: 


[u>-a(y,9)VA|2 
2Q(1+o9>2  +  6<e>A 

€ 

V2n(Q(l  +a0)z+A(0)A| 


1  +  P(y—  wVK,  8 1  dw 


■0) 


(A-I5) 


where 


a(y,0)  = 


qu  +oer 

Att-E2 


Ly-E0 


m  = 


Q*L  A 

- -(1  +o0)4  , 

Aft-E2 


and  where  P(x,0)  denotes  all  terms  but  the  leading  one  in  the  polynomial  factor  of  density 
(Equation  A-l).  Except  for  terms  of  order  A3^,  a  Maciaurin  series  expansion  of  P(y— wVZT.O) 
in  w  gives 
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1  +  P< 


+ 


(,-.VSe)=i+(^)[e(A*)-2f,(0s-i 

*  K  mH.  )2ef/-«%!+3«2)-[  i  A rL  )(#-*) 

*  (“F  -2-T5  j(e!-1)+2( -fl  M®2-1-) 

v  m2l  m2l 2  /v  '  '  a*2l2  '  K  ' 

2(  ~r )  02(/-3^)U^r+  f  -^-y+  ~ti(92_l) 

K  M2L;  x  '  *  ‘M2L  Af2L2V  f 

+  3A^-^- j282(y3-3Myj|u;2A  . 


(A- 16) 


Substituting  this  equation  makes  Equation  A-15  the  integral  of  the  right-hand  side  of 
Equation  A-16  as  a  polynomial  in  w  multiplied  by  a  Normal  to-density  with 

mean  =  aiy,Q)VK 


and 

variance  =  Q(1 +o8)2  +  b(8)A  . 

This  integral  can  therefore  be  evaluated  by  applying  standard  results  for  the  first  and 
second  moments  of  Normal  random  variables,  the  result  being  simply  the  right-hand  side  of 
Equation  A-16  with  <z(y, 8)A  substituted  for  wVA  and  Q(l  +  o8rA  substituted  for  w2A,  except 
for  terms  of  order  A2 .  Using  the  definition  of  a(y,6)  in  this  substitution,  retaining  only  terms 
which  are  significant  to  order  A2 A,  and  rearranging  terms  gives  the  integral  (Equation 
A-15)  as 
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Now  define 


V  =  M+Q(1+o2L)A  , 

so  that 


Af  =  V— Q(1+o2L)A  . 

Substituting  Equation  A-19  into  Equation  A-17  and  expanding  to  order  A2A 
integral  of  Equation  A- 15  to  this  accuracy  as 


(A-17) 


(A-18) 

(A-19) 
gives  the 


(A-20) 
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Using  Equation  A-19  to  expand  the  first  exponential  factor  of  Equation  A- 14  to  order  A2  A 
gives 


(A-21) 


The  joint  density  p<y,0)  is  the  product  of  expressions  A-20  and  A-21  to  order  A2A.  Carrying 
out  this  multiplication  to  this  order  of  accuracy  and  rearranging  terms  shows  that  p<y,0)  is 
the  same  as  pz  0(y, 0)  with  the  parameters  M,  A  and  p  changed  respectively  to 


M*  =  if  +  Q(1  +<rt)  A 
A*  =  A  +  QLoA,  and 
p*  *  p  +  JQL2o2A 


(A-22) 


TRANSFORMATION  4 

This  is  a  special  case  (o  =  0)  of  Transformation  3  with  the  roles  of  x  and  6  interchanged 
and  with  Q  replaced  by  e2Q2,  which  is  of  order  A 2  The  same  method  can  be  applied,  but  most 
of  the  terms  are  zero  or  negligible  to  order  A2  A  in  this  case  and  the  only  change  in  the  joint 
density  is  that  the  L  parameter  changes  to 

L *  =  L  +  c2Q2 A  .  (A-23) 

CONDITIONING 

Let  pixfi)  have  the  joint  density  (Equation  A-I)  and  let 


z  =  x+2/f9  +  2O0x+H02x+De2  +  «;  (A-24) 

where  K  and  Q  are  of  order  h,  H  and  D  are  of  order  A2,  and  where  v  is  an  independent  (of  x 
and  0)  zero-mean  Normal  random  variable  with  variance  r/A,  with  r  of  order  unity.  For 
simplicity,  D  is  used  in  place  of  A  here  and  H  has  been  taken  as  unity  (which  represents  no 
loss  of  generality  in  this  scalar  case).  By  the  Bayes  rule, 

p(x,0/z)  *  p(zfx,0)p(x,0)  (A-25) 

as  a  function  of  x  and  0.  From  Equation  A-24  and  the  density  of  v. 


pb/xfi)  = 


.-tn 


*+2W+2Q«*+H e2*+oe2 


)l 


V2ntfA 


(A-26) 
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Expanding  Che  exponent  about  the  value  indicated  and  then  using  the  Maclaurin  series 
expansion  of  the  exponential  function  gives 


p(z!x,  8)  = 


\  2 
(f-i-2 KO) 

.  2 r 


V2n/fA 


A 

1  4*  * 
r 


2Q0x  +  H02x  +  D02J^z— x  — 2K0  jj  (A-27) 


except  for  zero-mean  random  terms  of  order  A  and  other  terms  of  order  A3/2  in  the 
polynomial  factor.  Forming  the  product  of  Equation  A-25  by  multiplying  Equations  A-27 
and  A-l,  and  using  standard  results  of  Kalman  filter  theory  to  complete  the  square  in  the 
resulting  exponential  factor,  gives 


***■■ "(I  Hi-  II 


\  2A  _  /  \ 

A  /  2  .  \ 

+ — Q0(  z-x 

1  +  —  0  x2-Af) 

/  r  '  / 

+  —  ^Hx+D^O^z— x  j 


'(A-28) 


to  order  h2  A  divided  by  the  order  of  the  integral  of  Equation  A-28  over  x  and  0,  where  N(aji) 
denotes  the  Normal  density  with  mean  a  and  covariance  matrix  A,  and  where 


V 

n 


Af+2  KE)z 


> 


*  ”  r(* 

0  =  *(e+2KL)x 

*  E-  ~(mE-¥2KML^ 
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<t> 


A/ 

=  L-;( 


E+2KL J 


The  integral  of  Equation  A-28  can  be  evaluated  by  applying  standard  results  for 
moments  of  a  Normal  distribution,  which  gives 

1  +  —  |  |2Q^  x©  +n)  +  DL]*  — 2fl(xri+8  Vr  +  2ffx(j> j  — HLvj  (A-29) 

to  order  ft2 A.  Since  this  integral  is  of  order  unity,  p(x, Q/z)  is  given  to  order  A2A  by  dividing 
the  right-hand  side  of  Equation  A-28  by  Equation  A-29.  Expressing  the  polynomial  factor  of 
Equation  A-28  in  terms  of  x,  9,  V,  q,  and  $  to  order  A2 A  and  dividing  by  Equation  A-29  gives 
1  +  PK,  where,  to  order  /i2A, 

^  |(9- 5  )[(*- ^  M-2;(‘- ;  )K8- 8  )2-*l  -2(- '  h 


+  K^)2(8-8)2[(‘-:)4-6K‘-:)2+3v: 

+  ;4a(5W^)l("')+2°;(8-9) 

+  [,+2(/f4.n;)4,](-^)[(,-;)J-vj+2(Q+^)[(,-;)(e-8)-, 

+(":*^-2£)[(8-5H 

^(;+2D1X'-:)!(9-8/-*hD*(^;)|()’-')2->r|[(e-5)2-* 


-  *  {[d$  +  2q(  x$  +  2q+2JC4>)j(*-  x  )+2QV^0-  0  j 
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^  +  20^0  +3^  j|  [(x-  x  )2-V'|+2Q^[(x-r  j(e-0  j-n| 
f  (hV  +  4/COx  )|(o-0  j2-<t>|+2(Q+-^  j|(e_e  j[(x-i)2-vj 

-2 ;(-  :X,e- •>*-♦)-*(»- :  )"Hd+  p  +“(k+  ;) 
+2K“+;X^)|(‘-:)|(9-5H+(h+^i 
6  ^)[('- :  f-v\  \(°- 5  )2-^i +2°:(  ^  i°- 5  f[(‘-  * ) 


-  3v(,-  ;  )]  +2(  ^  )(n + ^  )(e-  5  )!|(«-  ;  )*-«v(.-  ;  )%3  v!|  | 

The  first  and  second  moments  of  p(x,0/2)  can  now  be  evaluated  by  applying  standard  results 
for  moments  of  Normal  distributions.  The  results  are,  to  order  A*A, 


E(x)  =  jc  =  x  + 


20^ V 0  +  qx  +2xj—  -  -  |o4>  +  2Q^ x0  +  3q  +  2JT<l>^|  A 
E(0)  =  0=0  +20  j(xz-v)& 


var(x)  =  t=  V+2^  |2Qq  +  ^2Q$x  +2JT$+3qj^  JzA 

‘)l< 


cov 


-  2—  |HV$  +  2Q^qx  +  V9  +3A  HA 


(x,0)  =  a  =  q+  ^QVt  +  xjxA-2^Q«j)iA 


var (0)  =  (■>  =  #  +  *  |(+20q$  +  f  Hx  +D^2JxA-  2-  |hV$  +  2Qx  jq  +  f  2ff  +  0  x  j  |a 


(A- 30) 

(A-31) 

(A-32) 

(A-33) 

(A-34) 

(A-3S) 
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Now  let 


P  =  z  —  x,  a  =  0—  9  ,  e  =  t—  V,  s  =  a  —  q 
and  u  =  w  —  $ .  Using  Equations  A-31  through  A-35,  and  the  fact  that 

(tH 

except  for  (negligible)  zero-mean  random  quantities  of  order  A,  gives  the  following  result  to 
order  h2A: 

P  =  2  —  Q^t0+ax+2\^z—  —  t|Dw  +  2Q^?0  +  3a  +  2iiLM+4QGj3r4-2—  xjj 

a  =  2  —  Qoj|x2— 


:  =  2  —  /j2i2a+^2Q«ax+2/fw+3a  jz—  2  —  t|Hto+2Q^ax  +  /0  j 


+  to  4Q2+16a(  —  )+6f  — 
\  to/  \  to 


i  =  2  — to|^Q  +  —  ^z— Bx 


a  =  2^  [5+2Qa<a+^Hx+Z))«2]z-2^«2[H/+4n2/+2Q!¥-t-4Kflz-(-2^Q7+4Q^ 
By  construction. 


/J  x  1  |  V  n  1  \  C  21  [n+°l 

VI  0  1  U  ♦  1/  2n\/tt-t)(&>-u)-(a-*)z 


to  order  A3  A  (for  «U  z  of  ordor  A  ~ l/2),  where 


(A-36) 

(A-37) 

(A-38) 

(A-391 

(A-40) 

(A-41) 
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and 


A  - 
771  =  X  —  X 


n = 0-0  . 

A  Taylor  series  expansion  of  the  right-hand  side  of  Equation  A-41  shows  that 


w([*  ,\v  n|)=4[-|.|'° 

M  fl  n  d>  J  /  Mol  a  co  A  G) 


(A-42) 


to  order  A2 A,  where 

1  /  *  \2, 


p0  -  \{z )  ('»2-‘)('*2-“)+ )+  ;(s  -  s  X"2-')-  Umn-° J 

K=)’*=(5-=)K^— =-  -  IA'4 


In  substituting  from  Equations  A-36  through  A-40  in  Equation  A-43  it  suffices  for  order  h2A 
accuracy  to  use  the  approximations 


and 


and  the  result  of  such  substitution  is 


'o  *  -  ;  (Kltt0?)+«|(^;X”2-'M0+ to X""-”) 

+  +D+  -2~^n8-o»j  +  2n^0  +  2j  jm+2Q2hJ  t 

+  r  KQ+to)Tm2"<Xn2'W)+4(Q+to)^B(',2“W)+|H"+2fl® 
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+  o>^4Q2  +  16Q^  ^  )  +  ®(  )  jJ("*2-^  +  2f22^mn-a  j-f  |h/  +  4Q^  -  -(-/Gf+Qx2 

+  Qf^|[«2-o>]+|ow  +  2Q('Sll+2a  +  2/r(a  +  4Qaw  +  2 j?)Jm  +  2QniJ  .  (A-44) 

By  definition, 

x—  x  =m  +  p  , 

0—  0  =  n  +  a  , 

V  =  t—e  , 
n  -a-s  , 

<t>  =  uj—  u  , 

*  =z-p 

and 

0  =0-o  . 

Making  these  substitutions  in  Equation  A-30  and  using  Equations  A-36  through  A-40  shows 
that,  to  order  h2A, 

+  -a||2Q^9+4  j  j+4-j- Jm+20zn 

+  |«+s“('lt+20?)|(^;X"’!'')+!(fl+^X"“'0) 

*•  (k+d+ ^ -2SX“’-“M!f’ +tz(ta+zHn'-“) 
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+ 2flX  ~t  X",!-‘)(,,j-“)+2(  £ . )(° *  i  M 

+•  —  |oj(^2Q'xJ  +  4fco^n  +  ~  j  —  |Do)  +  2Q^‘jcB  +  2a  +  2/C<ijj  J  m 

—  2Qln—  J  Hw  +  2Q^9  +  5  y  j  +  2  —  |^m2-f  j-2Q*^mn-a  j 
+  [4^ Qotf+  \  j  —2  - 4KQx -  H/ 1 ( a2- w j- 2^0  +  ^  j [  rt(/»2- 1  j 

—  2  —  m( «2— a>)  — 2am  I  —  |D  +  +  4£2(  K+  —  )+2a2K+  — 

mV  /  II  co2  (•)  /  V  to 

+2QrX  £  )H”J-“)-(B+  2 +6Q£  X"2-'X"2-") 

— 2Qx^  —  jn2^m3-3mtj  — 2^  —  6ftn2+3l*^|  .  (A-45) 


By  construction 


(A-46) 


to  order  A2 A.  Substituting  from  Equations  A-44  and  A-45  for  Pq  and  PK  and  carrying  out  the 
indicated  polynomial  multiplication  in  Equation  A-46  to  this  accuracy  (meaning  also  that 
z2^  is  replaced  by  r)  results  in  a  density  of  the  form  of  Equation  A-l  for  pix.O/z)  with 
parameters 


60 


NWCTP6530 


x*  =  (w+2/f£  +  4Qxj^-A/(DL  +  2Q£j^ 


0*  =  (e+2KL^ 


,  /  3  £  \  j  rA 

Af*  =  t  =  M+4  «AfE  +  X^E+  -- J  — 

A#2+2M^HML+2KE  +  4nx  jj  ^ 

E*  =  a  -  £+2(fiAff.  +  X^A-Af^£+2£L^ 


(A-47) 


£,*  =  «  =  L  +  2^+2QL£+D£,2)^  -(e+2/H.)  * 

X*  =  X-2(xAf+QAf2I.)* 

$*  =  5+2(p+2QXL^A-|ml(dL  +  2Q£)+M$ 
+•  2(e+2£l)(x  +  QMl)|  * 


p*  =  p-  [2Mp  +  HM2L2+2(x  +  flMI,)2+4AQAft|  ~ 

The  composition  of  the  parameter  changes  for  the  four  transformations  and  conditioning 
is  just  the  sum  of  the  incremental  changes  Equations  A-6,  A- 12,  A-22,  A-23,  and  A-47  to 
order  h2  A. 
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